Script 1 for Kitchel et al. 2023 in prep taxonomic diversity manuscript.

library(tidyverse)
library(sp)
library(raster)
#library(rgeos)
library(rgbif)
library(viridis)
library(gridExtra)
library(rasterVis)
library(concaveman)
library(sf)
library(cowplot)
library(data.table)
set.seed(1)

Pull in the compiled and cleaned FishGlob data (V 1.5), downloaded on November 28, 2022. This dataset is typically compiled by Dr. Aurore Maureaud. It includes both public and private data, so the download link cannot be shared. With some editing, however, you can run the analyses for the public trawl surveys.

| Survey code | Survey name short | Survey name long | Agency | Region | Access | Provider/link to access | Inclusion |
|---|---|---|---|---|---|---|---|
| AI | Aleutian Islands | Aleutian Islands | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| BITS-1 | Baltic Sea Q1 | Baltic Sea Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| BITS-4 | Baltic Sea Q4 | Baltic Sea Quarter 4 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| CHL | Chile | Chile | Universidad de Concepción, Chile | South America | Requires data request | Daniela Yepson and Luis Cubillos | Included |
| COL | Colombia | Colombian Caribbean | Universidad Nacional de Colombia | South America | Requires data request | Camilo B. Garcia | Too few years |
| DFO-HS | Hecate Strait | Hecate Strait | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/780a1c02-1f9c-4994-bc70-a0e9ef8e3968 and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| DFO-NF | Newfoundland | Newfoundland | Department of Fisheries and Oceans | Canada | Requires data request | Mariano Koen-Alonso | Included |
| DFO-QCS | Queen Charlotte Sound | Queen Charlotte Sound | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/a278d1af-d567-4964-a109-ae1e84cbd24a and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| DFO-SOG | Strait of Georgia | Strait of Georgia | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/d880ba18-8790-41a2-bf73-e9247380759b and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| DFO-WCHG | West Coast Haida Gwaii | West Coast Haida Gwaii | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/5ee30758-b1d6-49fe-8c4e-5136f4b39ad1 and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| DFO-WCVI | West Coast Vancouver Island | West Coast Vancouver Island | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/557e42ae-06fe-426d-8242-c3107670b1de and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| EBS | Eastern Bering Sea | Eastern Bering Sea | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| EVHOE | Bay of Biscay | Bay of Biscay | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| FALK | Falkland Islands | Falkland Islands | Falkland Islands Fisheries Department | Southern Ocean | Requires data request | Alexander Arkhipkin and Jorge Ramos | Excluded after spatial temporal standardization in next script |
| FR-CGFS | English Channel | English Channel | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| GIN | Guinea | Guinea | National Center of Fisheries Sciences of Boussoura, Conakry, Republic of Guinea | Africa | Requires data request | Mohammed Lamine Camara | Inconsistent sampling through space and time |
| GMEX-Summer | Gulf of Mexico Summer | Gulf of Mexico Summer | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| GMEX-Fall | Gulf of Mexico Fall | Gulf of Mexico Fall | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| GOA | Gulf of Alaska | Gulf of Alaska | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| GRL-DE | Greenland | Greenland | Thuenen Institute of Sea Fisheries | Europe | Requires data request | Karl-Michael Werner | Included |
| GSL-N | N Gulf of St. Lawrence | Northern Gulf of St. Lawrence | Department of Fisheries and Oceans | Canada | Public | See OceanAdapt: https://zenodo.org/records/8103080 for specific DFO links | Included |
| GSL-S | S Gulf of St. Lawrence | Southern Gulf of St. Lawrence | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/1989de32-bc5d-c696-879c-54d422438e64 and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| ICE-GFS | Iceland | Iceland | Marine and Freshwater Research Institute, Iceland | Europe | Requires data request | Jón Sólmundsson | Included |
| IE-IGFS | Irish Sea | Irish Sea | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| IS-TAU | Israel | Israel | Tel Aviv University | Asia | Requires data request | Jonathan Belmaker | Too few years |
| IS-MOAG | Israel | Israel | Israeli Ministry of Agriculture | Asia | Requires data request | Oren Sonin and Dori Edelist | Inconsistent sampling through space and time |
| MEDITS | Mediterranean | Mediterranean | Multiple | Europe | Requires data request | Contact corresponding author for contacts | Included |
| MRT | Mauritania | Mauritania | Institut Mauritanien de Recherches Océanographiques et des Pêches, Nouadhibou, Mauritania | Africa | Requires data request | Beyah Meissa | Inconsistent sampling through space and time |
| NAM | Namibia | Namibia | National Marine Information and Research Centre, Ministry of Fisheries and Marine Resources, Namibia | Africa | Requires data request | Johannes Kathena | Included |
| NEUS-Fall | NE US Fall | Northeast USA Fall | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| NEUS-Spring | NE US Spring | Northeast USA Spring | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| NIGFS-1 | N Ireland Q1 | North Ireland Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NIGFS-4 | N Ireland Q4 | North Ireland Quarter 4 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| Nor-BTS-3 | Barents Sea Norway Q3 | Barents Sea Norway Q3 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NS-IBTS-1 | N Sea Q1 | North Sea Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NS-IBTS-3 | N Sea Q3 | North Sea Quarter 3 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| NZ-CHAT | Chatham Rise NZ | Chatham Rise New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O’Driscoll and Fabrice Stephenson | Included |
| NZ-ECSI | E Coast S Island NZ | East Coast South Island New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O’Driscoll and Fabrice Stephenson | Included |
| NZ-SUBA | Sub-Antarctic NZ | Sub-Antarctic New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O’Driscoll and Fabrice Stephenson | Included |
| NZ-WCSI | W Coast S Island NZ | West Coast South Island New Zealand | National Institute of Water and Atmospheric Research Limited, New Zealand | Oceania | Requires data request | Richard O’Driscoll and Fabrice Stephenson | Included |
| PT-IBTS | Portugal | Portugal | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| ROCKALL | Rockall Plateau | Rockall Plateau | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| S-GEORG | S Georgia | South Georgia | British Antarctic Survey | Southern Ocean | Requires data request | Mark Belchier and Martin Collins | Included |
| SCS-Fall | Scotian Shelf Fall | Scotian Shelf Summer | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080 | Too few years |
| SCS-SPRING | Scotian Shelf Spring | Scotian Shelf Spring | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/fecf045a-95a2-4b69-8a40-818649a62716 and OceanAdapt: https://zenodo.org/records/8103080 | Too much data loss after spatial temporal standardization |
| SCS-SUMMER | Scotian Shelf Summer | Scotian Shelf Summer | Department of Fisheries and Oceans | Canada | Public | https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SEUS-fall | SE US Fall | Southeast USA Fall | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SEUS-spring | SE US Spring | Southeast USA Spring | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SEUS-summer | SE US Summer | Southeast USA Summer | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| SWC-IBTS-1 | Scotland Shelf Sea Q1 | Scotland Shelf Sea Quarter 1 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| SWC-IBTS-4 | Scotland Shelf Sea Q4 | Scotland Shelf Sea Quarter 4 | International Council for the Exploration of the Sea | Europe | Public | https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx | Included |
| WBLS | Western Black Sea | Western Black Sea | Institute of Fish Resources, Bulgaria | Europe | Requires data request | Elitsa Petrova, Feriha Tserkova & Vesselina Mihneva | Too few years |
| WCANN | W Coast US | West Coast USA | National Oceanic and Atmospheric Administration | USA | Public | DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080 | Included |
| ZAF-ATL | Atlantic Ocean ZA | Atlantic Ocean South Africa | Department of Forestry, Fisheries and the Environment, South Africa | Africa | Requires data request | Tracey Fairweather | Included |
| ZAF-IND | Indian Ocean ZA | Indian Ocean South Africa | Department of Forestry, Fisheries and the Environment, South Africa | Africa | Requires data request | Tracey Fairweather | Included |

FishGlob_1.5 <- fread(here::here("data","FISHGLOB_v1.5_clean.csv"))

This version of FishGlob leaves the season out of the survey unit for GMEX, so we fix that here

#add season to GMEX to survey unit

FishGlob_1.5[survey == "GMEX", survey_unit := paste0(survey,"-",season)]

Also adding in quarters for NIGFS

#add quarter to NIGFS survey unit

FishGlob_1.5[survey == "NIGFS", survey_unit := paste0(survey,"-",quarter)]

ZAF (South Africa) has distinct Atlantic and Indian surveys (split at ~20.01˚ E, Cape Agulhas)

FishGlob_1.5[survey == "ZAF" & longitude <20.01, survey_unit := "ZAF-ATL"][survey == "ZAF" & longitude >= 20.01, survey_unit := "ZAF-IND"]
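
The longitude split above can be checked on a toy table. This is a minimal base R sketch, not the script's data.table code: the `hauls` data frame and the `split_zaf()` helper are hypothetical, and only the 20.01 cutoff and the unit labels come from the line above.

```r
# Toy data: hypothetical hauls on either side of Cape Agulhas
hauls <- data.frame(
  survey      = c("ZAF", "ZAF", "ZAF", "NAM"),
  longitude   = c(17.5, 20.01, 24.3, 14.0),
  survey_unit = c("ZAF", "ZAF", "ZAF", "NAM"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: relabel ZAF hauls by ocean basin using the cutoff
split_zaf <- function(df, cutoff = 20.01) {
  is_zaf <- df$survey == "ZAF" & !is.na(df$longitude)
  df$survey_unit[is_zaf & df$longitude <  cutoff] <- "ZAF-ATL"
  df$survey_unit[is_zaf & df$longitude >= cutoff] <- "ZAF-IND"
  df
}

hauls_split <- split_zaf(hauls)
print(hauls_split)
```

A haul exactly at the cutoff is assigned to the Indian Ocean unit, and other surveys are untouched. Rows with a missing longitude would match neither condition and keep their original label, which may be why a plain "ZAF" unit still appears in the listing that follows.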

Region names

sort(unique(FishGlob_1.5[,survey_unit]))
 [1] "AI"          "BITS-1"      "BITS-4"      "CHL"         "COL"         "DFO-HS"      "DFO-NF"      "DFO-QCS"     "DFO-SOG"     "DFO-WCHG"    "DFO-WCVI"    "EBS"         "EVHOE"       "FALK"       
[15] "FR-CGFS"     "GIN"         "GMEX-Fall"   "GMEX-Summer" "GOA"         "GRL-DE"      "GSL-N"       "GSL-S"       "ICE-GFS"     "IE-IGFS"     "IS-MOAG"     "IS-TAU"      "MEDITS"      "MRT"        
[29] "NAM"         "NEUS-Fall"   "NEUS-Spring" "NIGFS-1"     "NIGFS-4"     "Nor-BTS"     "NS-IBTS-1"   "NS-IBTS-3"   "NZ-CHAT"     "NZ-ECSI"     "NZ-SUBA"     "NZ-WCSI"     "PT-IBTS"     "ROCKALL"    
[43] "S-GEORG"     "SCS-FALL"    "SCS-SPRING"  "SCS-SUMMER"  "SEUS-fall"   "SEUS-spring" "SEUS-summer" "SWC-IBTS-1"  "SWC-IBTS-4"  "WBLS"        "WCANN"       "WCTRI"       "ZAF"         "ZAF-ATL"    
[57] "ZAF-IND"    

##Data Replacements

####Greenland

The version in FishGlob 1.5 is missing lengths and therefore biomass values. This replacement version was obtained in September 2023 directly from Karl-Michael Werner (karl-michael.werner@thuenen.de), who now manages the Greenland survey. He is based in Germany.

#greenland <- 

####Norway

Prepped by Laurene Pecuchet (U Tromsø, Norway) in September 2023 to replace what’s in FishGlob 1.5, because IMR “are quite concerned that FishGlob, and other studies, have been using a “flawed” multi-surveys dataset that is available in NMDC (data portal of IMR). Turns out that this dataset was put publicly by miscommunication on NMDC after one published paper in Scientific Reports, and I think they only realized the existence of this dataset just the last year as some papers are coming out using it (especially the one from Cesc Gordo-Vilaseca in PNAS https://www.pnas.org/doi/10.1073/pnas.2120869120). They are now trying to make some damage controls to make sure that this dataset is not used ever again in the future, but that cleanded and standardised datasets of the Barents Sea survey that are publicly available in NMDC are used instead of.”

September 14: From Laurene, “I send you in attachment the “new” IMR survey formatted for Fishglob. I have done some small check of the dataset, and so far everything looks good, but I didn’t do a deep check yet, but I don’t see why there should be any problems with it…. For your study, I think it is also important that you know that there has been some inconsistencies in taxonomic descriptions in the Barents Sea so that some species should be considered at the genus level instead of for biodiversity analysis, I send you in attach an excel (Barents Sea Fish Reference List.csv) file that summarize which species might be a misidentification and which one should be considered and merged.” All of these files now live in “data/Norway_Sep2023”.

Helpful guidance from here: https://www.hi.no/en/hi/nettrapporter/rapport-fra-havforskningen-en-2021-15 - “2.2.5 - Recommended adjustments to the output before analysis. Eelpouts and liparids. When combing years, we recommend that all records of eelpouts (Zoarcidae) are pooled to the family level, because they are notoriously difficult to identify (see Appendix 3). The same apply to liparids (Liparidae). If species level data of these families are used, consider excluding data from 2004-2006/2007. These years the staff on some of the Norwegian vessels were inexperienced, and proper identification keys for arctic species were lacking (compare for instance catches of Lycodes frigidus and Lycodes eudipleurostictus in the first years to the later years, Appendix 3). If species level data of these families are used, records to family levels should be removed or else these will be treated as a separate species in the further analysis of the data. Both Zoarcidae and Liparidae have unresolved taxonomy for some genera, therefore we have chosen to pool all liparids of the genus Careproctus and all eelpouts of the genus Gymnelus in the output. Sebastes. The column “Sebastes spp.” contains mainly juvenile redfish. Small specimens are very difficult to identify so the protocol is to identify only individuals larger than 10 cm to the species level. Before analysis, all redfish (S. mentella, S. norvegicus, S. viviparus and Sebastes spp.) should be pooled, or Sebastes spp. should be removed – if not it will be treated as a separate species in the analysis. Records in Appendix 2. The records of the S. viviparus west of Svalbard (Spitsbergen) are unreliable and should be removed if Sebastes data are kept at the species level (Appendix 2). Species verified for the Barents Sea, but outliers in terms the normal depth range, distribution area within the Barents Sea, size etc. were coded as questionable in the data base (Appendix 2) and should be removed before analysis.
Consider also removing pelagic species (e.g. capelin and herring), as these are poorly sampled by the bottom trawl. The data should be standardised with towing distance before analysis.”

Therefore, we will:

- Remove all records of eelpouts and liparids (Family = Zoarcidae or Liparidae) (as we only include species ID’d to species)
- Remove redfish (Genus = Sebastes)


#load Norwegian data
load(here::here("data","Norway_Sep2023","NOR-BTS_clean.RData"))
norway_clean <- data.table(data)

#remove observations without dates
norway_clean <- norway_clean[complete.cases(norway_clean[,.(month)]),]

#remove species records in accordance with recommendation from HI
norway_clean <- norway_clean[!(family %in% c("Zoarcidae","Liparidae") | genus == "Sebastes"),]

#some column names don't match FishGlob (FishGlob = num, num_h, num_cpue, wgt, wgt_h, wgt_cpue; Norway = num, num_cpue (number of ind./hour), num_cpua (number of ind./km2), wgt, wgt_cpue (kg/min), wgt_cpua (kg/km2))
#also, some column units in the readme are incorrect. Therefore, I will generate _cpue and _h values here
#we will need to check and rename columns
setnames(norway_clean, c("haul_dur"), c("haul_dur_m"))
norway_clean[,haul_dur := haul_dur_m/60] #haul duration currently in minutes, need hours
norway_clean[,num_h := num/haul_dur][,num_cpue := num/area_swept][,wgt_h := wgt/haul_dur][,wgt_cpue := wgt/area_swept]

#change some columns to numeric
cols = c("month","day")
norway_clean[,(cols) := lapply(.SD,as.numeric),.SDcols = cols]

#also, delete source and timestamp
fishglob_colnames <- colnames(FishGlob_1.5)
norway_clean <- norway_clean[,..fishglob_colnames]

norway_clean[survey == "Nor-BTS" & month %in% c(1:6), survey_unit := "Nor-BTS-1"][survey == "Nor-BTS" & month %in% c(7:12), survey_unit := "Nor-BTS-3"]

#IBTS and Nor-BTS surveys overlap south of 62˚ latitude, so delete all Nor-BTS hauls that occur below 62˚ latitude
norway_clean <- norway_clean[latitude  >= 62,]
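
The unit conversions in the Norway block above can be sanity-checked on a toy table. This is a minimal base R sketch with made-up numbers; the column names follow the script, while the km^2 unit for `area_swept` and kg for `wgt` are assumptions about the FishGlob format rather than something stated here.

```r
# Hypothetical two-haul example; values are invented for illustration
hauls <- data.frame(
  haul_dur_m = c(30, 15),      # haul duration in minutes
  area_swept = c(0.05, 0.02),  # swept area (assumed km^2)
  num        = c(120, 10),     # individuals counted
  wgt        = c(6, 1.5)       # catch weight (assumed kg)
)

hauls$haul_dur <- hauls$haul_dur_m / 60         # minutes -> hours
hauls$num_h    <- hauls$num / hauls$haul_dur    # individuals per hour
hauls$num_cpue <- hauls$num / hauls$area_swept  # individuals per unit area
hauls$wgt_h    <- hauls$wgt / hauls$haul_dur    # weight per hour
hauls$wgt_cpue <- hauls$wgt / hauls$area_swept  # weight per unit area

print(hauls)
```

Dividing by duration in minutes instead of hours would inflate the `_h` columns by a factor of 60, which is the kind of mismatch the rename from `haul_dur` to `haul_dur_m` guards against.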

Delete Norway from FishGlob 1.5 (the Greenland deletion is commented out for now)

FishGlob_1.5 <- FishGlob_1.5[!(survey %in% c("Nor-BTS"
                                             #,
                                             #"GRL-DE" #ignore greenland for now...
                                             ))]

Add in updated Norway data (the Greenland replacement is commented out for now)

FishGlob_1.5 <-rbind(FishGlob_1.5,norway_clean)
#FishGlob_1.5 <-rbind(FishGlob_1.5,greenland)

##Preliminary Data Cuts

###Specific Regional Changes Before Cutting to 10 years only

GSL

- North: we have data 1980-2019, but gear changes in 2004/2005, so let’s use the later portion (more consistent months of sampling; 2005-2019; 15 years)
- South: we have data 1970-2019, but gear/vessel changes in 1985 and again in 1992, so again let’s use the later portion (1992-2019; 28 years)
- See this github issue

#identify haul_ids of hauls we should remove from GSL surveys
haul_ids_to_remove_GSL <- unique(FishGlob_1.5[(survey == "GSL-N" & year < 2005)|(survey == "GSL-S" & year < 1992),haul_id])

FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_GSL),] #remove hauls before consistent gear/vessel was used
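
The same cut can also be written as a lookup of the first usable year per survey. This is a hypothetical base R alternative on toy data, not the script's method: the `catch` table and the `first_year` vector are invented for illustration, with the 2005 and 1992 cutoffs taken from the text above.

```r
# Toy catch records; haul_ids and years are invented
catch <- data.frame(
  survey  = c("GSL-N", "GSL-N", "GSL-S", "GSL-S", "EBS"),
  year    = c(2003, 2006, 1990, 1995, 1984),
  haul_id = c("n1", "n2", "s1", "s2", "e1"),
  stringsAsFactors = FALSE
)

# First year with consistent gear/vessel, by survey (hypothetical lookup)
first_year <- c("GSL-N" = 2005, "GSL-S" = 1992)

# Surveys absent from the lookup get NA and are never dropped
cutoff <- first_year[catch$survey]
drop   <- !is.na(cutoff) & catch$year < cutoff
kept   <- catch[!drop, ]
print(kept)
```

A lookup like this keeps each regional cutoff in one place, at the cost of being less explicit than the two conditions spelled out in the script.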

SGEORG - From Martin Collins, “Most surveys were focused on demersal fish on the South Georgia shelf (< 350 m), but surveys in 2003, 2010 and 2019 had some deeper trawls. The deeper trawls caught very different fish, so are unlikely to be of use to a long-term analysis, but I have left them in.”

-Delete all trawls deeper than 350 m

#identify haul_ids of hauls we should remove from the South Georgia survey (unit is labelled "S-GEORG" in the survey_unit listing above)
haul_ids_to_remove_SGEORG <- unique(FishGlob_1.5[(survey_unit == "S-GEORG" & depth >350),haul_id])

FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_SGEORG),] #remove hauls deeper than 350 m

NZ-CHAT - bump December observations to the next year, because each survey's observations occur in months 12, 1, and 2

#bump observations forward
FishGlob_1.5[survey == "NZ-CHAT" & month == 12,  year := year+1, ]
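
The effect of the December bump can be seen on a toy table. This is a minimal base R sketch with invented rows; the `season_year` column is a hypothetical name used here so the original `year` stays visible, whereas the script overwrites `year` in place.

```r
# Toy observations spanning one austral-summer survey (Dec, Jan, Feb)
obs <- data.frame(
  survey = c("NZ-CHAT", "NZ-CHAT", "NZ-CHAT", "AI"),
  year   = c(2010, 2011, 2011, 2010),
  month  = c(12, 1, 2, 12),
  stringsAsFactors = FALSE
)

# Assign December NZ-CHAT hauls to the following calendar year
obs$season_year <- ifelse(obs$survey == "NZ-CHAT" & obs$month == 12,
                          obs$year + 1, obs$year)
print(obs)
```

All three NZ-CHAT rows end up in the same survey year, while the December row from another survey is left alone.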

###Because time is an essential component of these analyses, we will get rid of any survey x season combinations that are not sampled for at least 10 years

#new column for total number of years sampled
FishGlob_1.5[,years_sampled := length(unique(year)),.(survey_unit)]

summary(FishGlob_1.5$years_sampled) #ranges from 2 (DFO Strait of Georgia) to 57 (Northeast US)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   23.00   29.00   30.89   37.00   57.00 
View(unique(FishGlob_1.5[,.(survey_unit, years_sampled)]))

#statistics about full dataset
nrow(FishGlob_1.5) 
[1] 4172996
length(unique(FishGlob_1.5[,survey])) 
[1] 45
length(unique(FishGlob_1.5[,survey_unit])) 
[1] 57
#remove observations for any region x season combinations sampled in fewer than 10 years
FishGlob.10year <- FishGlob_1.5[years_sampled >= 10,]

#statistics about reduced 10 year dataset
nrow(FishGlob.10year) 
[1] 4089112
length(unique(FishGlob.10year[,survey])) 
[1] 38
length(unique(FishGlob.10year[,as.character(survey_unit)])) 
[1] 48
#remove full database
rm(FishGlob_1.5)
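
The 10-year cut above counts distinct years per survey unit and then filters on that count. This is a minimal base R sketch of the same logic on invented records (the script itself uses data.table's grouped `:=`); unit names "A" and "B" are hypothetical.

```r
# Toy records: unit A sampled in 12 years, unit B in only 2
records <- data.frame(
  survey_unit = c(rep("A", 12), rep("B", 3)),
  year        = c(2000:2011, 2000, 2000, 2001),
  stringsAsFactors = FALSE
)

# Number of distinct years per survey unit
years_by_unit <- tapply(records$year, records$survey_unit,
                        function(y) length(unique(y)))

# Attach the count to every row, then keep units with >= 10 years
records$years_sampled <- as.integer(years_by_unit[records$survey_unit])
kept <- records[records$years_sampled >= 10, ]
print(unique(kept$survey_unit))
```

Counting distinct years rather than rows matters here: unit B has three records but only two sampled years.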

###For taxonomic analyses, resolution to species is required. Therefore, we will exclude any observations not resolved to species.

#convert month to numeric
FishGlob.10year[,month := as.numeric(month)]

FishGlob.10year.spp <- FishGlob.10year[rank %in% c("Species", "Subspecies"),] #3869384 total observations

#remove 10 year database that is not yet filtered to species
rm(FishGlob.10year)

#vector with all survey unit names
all_survey_units <- sort(unique(FishGlob.10year.spp[,survey_unit]))

#calculate # species per year
FishGlob.10year.spp_survey_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, accepted_name)])

FishGlob.10year.spp_survey_year[,spp_count_survey_year := uniqueN(accepted_name),.(survey_unit, year)]

FishGlob.10year.spp_survey_year.r <-unique(FishGlob.10year.spp_survey_year[,.(survey_unit,  year, spp_count_survey_year)])

nrow(FishGlob.10year.spp_survey_year.r)
[1] 1215
#calculate # hauls per year
FishGlob.10year.spp_haulid_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, haul_id)])

FishGlob.10year.spp_haulid_year[,haulid_count_survey_year := uniqueN(haul_id),.(survey_unit, year)]

FishGlob.10year.spp_haulid_year.r <-unique(FishGlob.10year.spp_haulid_year[,.(survey_unit,  year, haulid_count_survey_year)])

nrow(FishGlob.10year.spp_haulid_year.r)
[1] 1215

##Visually Inspect Distribution of Data Through Time and Space

##Spatial and Temporal Patterns in All Trawl Surveys

Let’s look at the number of hauls per year/month and year/quarter and year/season visually

#unique survey, survey_unit, year, month, quarter, season, haul_id, lat, lon
FishGlob.10year.uniquehauls <- unique(FishGlob.10year.spp[,.(survey, survey_unit, year,month,quarter,season,haul_id, latitude, longitude,haul_dur)])

#add column with adjusted longitude for few surveys that cross dateline (NZ-CHAT and AI)
FishGlob.10year.uniquehauls[,longitude_adj := ifelse((survey_unit %in% c("AI","NZ-CHAT") & longitude > 0),longitude-360,longitude)]
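
The dateline adjustment above shifts positive longitudes by 360 degrees so that hauls on both sides of the antimeridian plot next to each other. This is a minimal base R sketch on invented coordinates; `adjust_longitude()` is a hypothetical helper, not a function from the script.

```r
# Hypothetical helper: shift eastern-hemisphere longitudes for flagged surveys
adjust_longitude <- function(longitude, crosses_dateline) {
  ifelse(crosses_dateline & longitude > 0, longitude - 360, longitude)
}

# Toy hauls; EBS is included to show that unflagged surveys are unchanged
hauls <- data.frame(
  survey_unit = c("AI", "AI", "NZ-CHAT", "EBS"),
  longitude   = c(179.5, -179.2, 176.0, 178.0),
  stringsAsFactors = FALSE
)

hauls$longitude_adj <- adjust_longitude(
  hauls$longitude,
  hauls$survey_unit %in% c("AI", "NZ-CHAT")
)
print(hauls)
```

After the shift the two AI hauls sit 1.3 degrees apart instead of at opposite edges of the map.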

FishGlob.10year.uniquehauls[,haul_counts_per_survey_season_month :=uniqueN(haul_id),.(survey, month, season)][, #count # hauls per survey, season, and month
                     haul_counts_per_survey_quarter_month :=uniqueN(haul_id),.(survey, month, quarter)][,#count # hauls per survey, month, and quarter
                     total_hauls_survey :=uniqueN(haul_id),.(survey)][,#count # hauls per survey in all years
                                                        
              #proportion of hauls for each survey, season, and month divided by total # over all years
                     haul_proportion_survey_season :=haul_counts_per_survey_season_month/total_hauls_survey][,
              #proportion of hauls for each survey, quarter, and month divided by total # over all years
                     haul_proportion_survey_quarter :=haul_counts_per_survey_quarter_month/total_hauls_survey][,
                                                                                                               
                     haul_count_per_survey_year_month :=uniqueN(haul_id),.(year, survey_unit, month)][, #count # hauls per survey unit, year, and month
                     total_hauls_survey_year := uniqueN(haul_id),.(survey_unit,year)][, #count total # hauls per survey unit and year
                     #proportion of hauls for each survey unit and month divided by total # hauls within a survey unit within a year
                     haul_proportion_month_yearly := haul_count_per_survey_year_month/total_hauls_survey_year][, 

                     haul_count_per_survey_year_quarter :=uniqueN(haul_id),.(year, survey_unit, quarter)][, #count # hauls per survey unit, year, and quarter
                     #proportion of hauls for each survey unit and quarter divided by total # hauls within a survey unit within a year
                     haul_proportion_quarter_yearly := haul_count_per_survey_year_quarter/total_hauls_survey_year] 
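
The within-year proportions computed in the chain above can be reproduced on a toy table. This is a minimal base R sketch using `ave()` in place of data.table's grouped `uniqueN()`; the `hauls` table, the unit name "X", and the shortened column names are invented for illustration.

```r
# Toy hauls: 2001 has 3 hauls in June and 1 in July; 2002 has 1 in each
hauls <- data.frame(
  survey_unit = "X",
  year        = c(2001, 2001, 2001, 2001, 2002, 2002),
  month       = c(6, 6, 6, 7, 6, 7),
  haul_id     = paste0("h", 1:6),
  stringsAsFactors = FALSE
)

n_unique <- function(x) length(unique(x))
haul_code <- as.integer(factor(hauls$haul_id))  # numeric stand-in for haul_id

# Distinct hauls per unit-year-month and per unit-year
hauls$per_month <- ave(haul_code, hauls$survey_unit, hauls$year, hauls$month,
                       FUN = n_unique)
hauls$per_year  <- ave(haul_code, hauls$survey_unit, hauls$year,
                       FUN = n_unique)

# Share of each year's hauls that fall in each month
hauls$proportion <- hauls$per_month / hauls$per_year
print(hauls)
```

Within each unit and year the distinct month proportions sum to 1, which is a quick check that the numerator and denominator use matching groupings.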

FishGlob.10year.uniquehauls.season <- unique(FishGlob.10year.uniquehauls[,.(survey, survey_unit, month, season, haul_counts_per_survey_season_month,total_hauls_survey, haul_proportion_survey_season)]) #relative sampling by season across all years

FishGlob.10year.uniquehauls.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey,survey_unit , month, quarter, haul_counts_per_survey_quarter_month,total_hauls_survey, haul_proportion_survey_quarter)]) #relative sampling by quarter across all years

FishGlob.10year.uniquehauls.annual.month <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, month, haul_count_per_survey_year_month,total_hauls_survey_year,haul_proportion_month_yearly)]) #relative sampling by month within years

FishGlob.10year.uniquehauls.annual.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, quarter, haul_count_per_survey_year_quarter,total_hauls_survey_year,haul_proportion_quarter_yearly)]) #relative sampling by quarter within years

#how does #hauls vary with season and month?
survey_season_month_hauls <- ggplot(FishGlob.10year.uniquehauls.season) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  facet_wrap(~survey,scales = "free_y") +
  theme_classic()

ggsave(survey_season_month_hauls, filename = "survey_season_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")

#how does #hauls vary with quarter and month?
survey_quarter_month_hauls <- ggplot(FishGlob.10year.uniquehauls.quarter) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  facet_wrap(~survey,scales = "free_y") +
  theme_classic()

ggsave(survey_quarter_month_hauls, filename = "survey_quarter_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")

#how does #hauls vary with year and month?
year_survey_month_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.month) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  facet_wrap(~survey_unit,scales = "free_y") +
  theme_classic()

ggsave(year_survey_month_hauls, filename = "year_survey_month_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")

#how does #hauls vary with year and quarter?
year_survey_quarter_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.quarter) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  facet_wrap(~survey_unit,scales = "free_y") +
  theme_classic()

ggsave(year_survey_quarter_hauls, filename = "year_survey_quarter_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")

Now, let’s look at how location of sampling varies by month of sampling and year of sampling

location_by_year <- ggplot(FishGlob.10year.uniquehauls) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  facet_wrap(~survey_unit, scales = "free") +
  theme_classic()

ggsave(location_by_year, filename = "location_by_year.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_year, filename = "location_by_year.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_year, filename = "location_by_year.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
(location_by_month <- ggplot(FishGlob.10year.uniquehauls) +
  geom_point(aes(x = longitude_adj, y = latitude, color = as.numeric(month)), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  facet_wrap(~survey_unit, scales = "free") +
  theme_classic())

ggsave(location_by_month, filename = "location_by_month.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_month, filename = "location_by_month.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_month, filename = "location_by_month.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

##Region Specific Data Processing

- Fredston et al. 2022 Nature and Batt et al. 2017 Ecology Letters informed North American data processing
- Personal communication with Aurore Maureaud and Laurene Pecuchet re: work by A. Maureaud, L. Pecuchet and R. Frelat and the supplementary material for Maureaud et al. 2019 Proceedings of the Royal Society B: Biological Sciences informed European data processing
- Additional data processing informed by data itself, and by FishGlob pdf summary documents
- limit to max 3 months for each survey unit, representative of a ‘season’ (exception = West Coast USA where all 4 months sampled consistently)

####“AI”

ggplot(FishGlob.10year.uniquehauls.season[survey == "AI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey == "AI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey == "AI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey == "AI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "AI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "AI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

ai_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "AI" & month %in% c(6:8),haul_id])
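Every region below repeats this same filter with different months and years. A minimal sketch of that pattern as a single function, assuming the column names used in this script (`survey_unit`, `month`, `year`, `haul_id`); the helper name `hauls_to_keep()`, its argument names, and the toy table are hypothetical and not part of the original script.

```r
library(data.table)

# Toy stand-in for FishGlob.10year.uniquehauls (hypothetical values)
toy_hauls <- data.table(
  survey_unit = c("AI", "AI", "AI", "AI", "EBS"),
  year        = c(1991, 1991, 1994, 1994, 1984),
  month       = c(5, 6, 7, 8, 6),
  haul_id     = c("a1", "a2", "a3", "a4", "e1")
)

# Hypothetical helper: unique haul IDs for one survey unit, restricted to
# the chosen months and an optional year range.
# Argument names are chosen so they do not clash with column names.
hauls_to_keep <- function(dt, unit_name, keep_months,
                          min_year = -Inf, max_year = Inf) {
  unique(dt[survey_unit == unit_name &
              month %in% keep_months &
              year >= min_year & year <= max_year,
            haul_id])
}

ai_keep_toy <- hauls_to_keep(toy_hauls, "AI", keep_months = 6:8)
ai_keep_toy
```

Keeping the explicit one-line filters, as the script does, has the advantage that each region's decision stays visible next to its diagnostic plots.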

####BITS (We have two surveys for BITS, quarter 1 and quarter 4)

BITS 1

From Fredston et al. 2023, every year after 2000 has >400 hauls and most of the earlier years have <50

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep both months (2,3)
- Seemingly consistent spatial distribution through time
- Consistent # of species and # hauls after 2000

bits1_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-1" & month %in% c(2,3) & year > 2000,haul_id])

BITS 4

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep (10,11,12)
- Start in 2000 (data begin in 1996, but there is a gap in 1997 and 1998, and all 1996 hauls are in December; spp richness in the first survey is also very low; consistent # of hauls after 2000)
- Seemingly consistent spatial distribution through time

bits4_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-4" & month %in% c(10:12) & year > 2000,haul_id])
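Gaps such as the missing 1997 and 1998 surveys can also be listed directly rather than read off the plots. A minimal sketch on a hypothetical toy table (the values are invented and only mimic the pattern described above):

```r
library(data.table)

# Toy stand-in (hypothetical values): one row per haul, with its year
toy_hauls <- data.table(
  survey_unit = "TOY",
  year        = c(1996, 1999, 1999, 2000, 2001, 2001, 2002),
  haul_id     = paste0("h", 1:7)
)

sampled_years <- sort(unique(toy_hauls$year))

# Years between the first and last survey year that have no hauls at all
missing_years <- setdiff(seq(min(sampled_years), max(sampled_years)),
                         sampled_years)
missing_years
```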

####CHL (Chile)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "CHL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "CHL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "CHL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "CHL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "CHL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "CHL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep (7,8,9)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

chl_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "CHL" & month %in% c(7:9),haul_id])

####DFO-NF

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-NF",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-NF",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep (10,11,12)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

dfo_nf_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF" & month %in% c(10:12),haul_id])

####DFO-QCS

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep (7,8)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

dfo_qcs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS" & month %in% c(7,8),haul_id])

####EBS

- Sampling years prior to 1984 (data begin in 1982) were excluded from analysis due to large apparent increases in the number of species recorded in the first two years. (Batt et al. 2017)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EBS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EBS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EBS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EBS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EBS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EBS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep (6,7,8)
- Seemingly consistent spatial distribution through time
- Per Batt et al. 2017, limit to >= 1984
- No clear changes in spp richness through time
- No clear changes in # hauls through time

ebs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EBS" & month %in% c(6,7,8) & year >= 1984,haul_id])

####EVHOE

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EVHOE",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EVHOE",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep (10,11,12)
- Seemingly consistent spatial distribution through time
- Very low sampling in 2017 (and also low richness), so exclude this year

evhoe_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EVHOE" & month %in% c(10,11,12) & year != 2017 ,haul_id])
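A poorly sampled year like 2017 can be flagged from the yearly haul counts. This is a minimal sketch with invented counts; the half-of-median threshold is an assumed rule of thumb for illustration and is not a criterion used in this script, where the year was identified from the plots.

```r
library(data.table)

# Toy stand-in (hypothetical values): hauls per year for one survey unit
toy_counts <- data.table(
  year    = 2012:2018,
  n_hauls = c(140, 150, 145, 155, 150, 20, 148)
)

# Assumed rule of thumb (not from the original script): flag years with
# fewer than half the median number of hauls
threshold <- 0.5 * median(toy_counts$n_hauls)
low_years <- toy_counts[n_hauls < threshold, year]
low_years
```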

####FALK (excluded from final dataset)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FALK",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FALK",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FALK",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FALK",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FALK",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FALK",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep February (2) only, from 2004 onward (most consistent sampling)
- Inconsistent spatial distribution through time, but this will be fixed in the next step with spatial standardization

falk_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FALK" & month %in% c(2) & year >= 2004, haul_id])

####FR-CGFS

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep 9,10,11
- Consistent spatial distribution through time
- Seemingly consistent richness through time
- Seemingly consistent # hauls through time

fr_cgfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS" & month %in% c(9,10,11), haul_id])

####GIN (excluded from final dataset)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GIN",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GIN",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GIN",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GIN",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GIN",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GIN",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Exclude this region: no consistent sampling through time

gin_hauls_keep <- NULL
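Setting an excluded region's vector to `NULL` is convenient if the per-region vectors are later concatenated, because `c()` silently drops `NULL` elements. That downstream step is not shown in this section, so treat it as an assumption; the vectors below are hypothetical.

```r
# Toy haul ID vectors (hypothetical values)
region_a_keep <- c("a1", "a2")
region_b_keep <- NULL          # excluded region, as with GIN
region_c_keep <- c("c1")

# NULL contributes nothing to the combined vector
all_keep <- c(region_a_keep, region_b_keep, region_c_keep)
all_keep
length(all_keep)
```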

####GMEX

- In the Gulf of Mexico, we restricted our analysis to data from 1984 - 2000 (full range 1982-2014); if all years had been used, the number of sites sampled in at least 85% of years would drop from 39 to 13. (Batt et al. 2017)

GMEX Fall

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep 9,10,11
- Inconsistent spatial distribution through time, so restrict to longitude < -87.5
- Seemingly consistent richness through time
- Seemingly consistent # hauls through time

gmex_fall_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall" & month %in% c(9,10,11) & longitude_adj < -87.5, haul_id])
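The effect of the longitude cut-off can be checked by comparing the east-west extent of sampling in each year before and after the filter. A minimal sketch with invented coordinates; only the column name `longitude_adj` and the -87.5 cut-off come from this script.

```r
library(data.table)

# Toy stand-in (hypothetical values)
toy_hauls <- data.table(
  year          = c(1990, 1990, 1991, 1991, 2010, 2010),
  longitude_adj = c(-95, -89, -94, -88, -96, -83),
  haul_id       = paste0("g", 1:6)
)

# Western and eastern limits of sampling in each year
extent_by_year <- toy_hauls[, .(lon_min = min(longitude_adj),
                                lon_max = max(longitude_adj)), by = year]
extent_by_year

# After the restriction, the eastern limit is comparable across years
west_only <- toy_hauls[longitude_adj < -87.5]
west_only[, .(lon_max = max(longitude_adj)), by = year]
```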

GMEX Summer

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 5,6,7
- Inconsistent spatial distribution through time, but this will be fixed in the spatial standardization step
- Seemingly consistent richness within each period (before 2008, and from 2008 onward)
- Seemingly consistent # hauls through time
- Jump from 2007 to 2008, when the spatial footprint increases, so I will only use data from before 2008

gmex_summer_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer" & month %in% c(5,6,7) & year <2008, haul_id])
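A jump in spatial footprint like the one between 2007 and 2008 can be quantified by counting how many grid cells are sampled each year. A minimal sketch with invented coordinates; the 1-degree cell size is an assumption for illustration and is not the resolution used in the spatial standardization step.

```r
library(data.table)

# Toy stand-in (hypothetical values)
toy_hauls <- data.table(
  year          = c(2006, 2006, 2007, 2007, 2008, 2008, 2008, 2008),
  longitude_adj = c(-94.2, -93.7, -94.5, -93.1, -94.3, -92.6, -90.4, -88.9),
  latitude      = c(28.3, 28.8, 28.1, 28.6, 28.4, 28.9, 29.2, 29.6)
)

# Assign each haul to a 1-degree cell (assumed resolution), then count
# the distinct cells occupied in each year
toy_hauls[, cell := paste(floor(longitude_adj), floor(latitude))]
footprint <- toy_hauls[, .(n_cells = uniqueN(cell)), by = year]
footprint
```

A sharp rise in `n_cells` between consecutive years would point to a change in survey coverage rather than in the community itself.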

####GOA

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GOA",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GOA",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GOA",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GOA",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GOA",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GOA",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 6,7,8
- Consistent spatial distribution through time
- Seemingly consistent richness through time
- Seemingly consistent # hauls through time

goa_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GOA" & month %in% c(6,7,8), haul_id])

####GRL-DE

- From Beukhof et al. 2019, all surveys in October and November

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GRL-DE",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GRL-DE",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- No months in data set, but according to Beukhof et al. 2019 all sampling occurs in October and November, so keep all
- Consistent spatial distribution through time
- Seemingly consistent richness
- Number of hauls drops between 1991 and 1992, and both 1992 and 2017 are low, so limit to the years in between (1993-2016)

grl_de_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE" & year %in% c(1993:2016), haul_id])

####GSL

GSL-N

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-N",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-N",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 6, 7, 8
- Consistent spatial distribution through time
- Seemingly consistent richness
- Number of hauls in 2005 is higher, so start in 2006

gsl_n_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-N" & year > 2005, haul_id])

GSL-S

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-S",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-S",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 8, 9, 10
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls

gsl_s_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-S" & month %in% c(8:10), haul_id])

####ICE-GFS

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 2, 3, 4
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls

ice_gfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS" & month %in% c(2:4), haul_id])

####IE-IGFS

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 10, 11, 12
- Consistent spatial distribution through time after 2004 (sampled far east in 2003 and 2004)
- Seemingly consistent richness
- Seemingly consistent number of hauls

ie_igfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS" & month %in% c(10:12) & year  > 2004, haul_id])

####IS-MOAG (excluded from final dataset)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

-Sampling too scattered over time, excluding

is_moag_hauls_keep <- NULL

####MEDITS

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MEDITS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MEDITS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep all surveys in quarter 2
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls

medits_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "MEDITS", haul_id])

####MRT (excluded from final dataset)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MRT",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MRT",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MRT",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MRT",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MRT",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MRT",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

-Sampling inconsistent, exclude completely

mrt_hauls_keep <- NULL

####NAM

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NAM",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NAM",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NAM",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NAM",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NAM",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NAM",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep surveys in months 1 and 2 (most consistently sampled)
- Consistent spatial distribution through time
- Seemingly consistent richness except for 1998 (exclude)
- Seemingly consistent number of hauls except for 1998 (exclude)

nam_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NAM" & month %in% c(1,2) & year != 1998, haul_id])

####NEUS

NEUS Spring

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 3, 4, 5
- Inconsistent spatial distribution through time, but should be caught in standardization step
- Seemingly consistent richness (especially after 1987; should be fixed with standardization step)
- Seemingly consistent number of hauls (especially after 1981; should be fixed with standardization step)

#calculate wgt_cpue (km^2 average from Sean Lucey) and wgt_h (all biomass values calibrated to the standard pre-2009 30 minute tow), plus the matching abundance columns num_cpue and num_h
FishGlob.10year.spp[survey == "NEUS", `:=`(wgt_h = wgt/0.5,
                                           wgt_cpue = wgt/0.0384,
                                           num_h = num/0.5,
                                           num_cpue = num/0.0384)]
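
As a worked illustration of the conversion above, the sketch below applies the same two divisors to a made-up catch table. `toy_catch`, `tow_hours` and `tow_area_km2` are hypothetical names; reading 0.5 as the standard tow duration in hours and 0.0384 as the area per tow in km^2 is an interpretation of the comment above, not something the script states explicitly.

```r
library(data.table)

# Toy catch records (made-up values); only NEUS rows should be converted
toy_catch <- data.table(
  survey = c("NEUS", "NEUS", "GOA"),
  wgt    = c(1.2, 0.6, 3.0),   # biomass per tow
  num    = c(12, 6, 30)        # abundance per tow
)

tow_hours    <- 0.5      # assumed: standard 30 minute tow, in hours
tow_area_km2 <- 0.0384   # assumed: area per tow in km^2 (divisor used above)

# One grouped assignment creates all four columns for the NEUS rows only;
# rows from other surveys are left as NA
toy_catch[survey == "NEUS", `:=`(
  wgt_h    = wgt / tow_hours,
  wgt_cpue = wgt / tow_area_km2,
  num_h    = num / tow_hours,
  num_cpue = num / tow_area_km2
)]
```

For the first toy row, 1.2 per tow becomes 1.2 / 0.5 = 2.4 per hour and 1.2 / 0.0384 = 31.25 per km^2.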


#also, for the Northeast US, exclude hauls whose duration is more than 5 minutes off the standard tow: 30 minutes before 2009 (haul_dur between 0.42 and 0.58 hours) and 20 minutes from 2009 onward (haul_dur between 0.25 and 0.42 hours)
neus_spring_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Spring" & month %in% c(3:5) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
                                                        (survey_unit == "NEUS-Spring" & month %in% c(3:5) & year >= 2009 & (haul_dur > 0.25  & haul_dur < 0.42))), haul_id])
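
The duration bounds in the filter above are the minute tolerances expressed in hours (25 to 35 minutes is roughly 0.42 to 0.58 h; 15 to 25 minutes is 0.25 to roughly 0.42 h). A minimal sketch of the same rule written directly in minutes follows. `neus_duration_ok()` is a hypothetical helper, not part of the original script, and it assumes `haul_dur` is recorded in hours.

```r
# Hypothetical helper: TRUE when a tow lasted within tol_min minutes of the
# standard duration for its year (30 min before 2009, 20 min from 2009 onward).
# Assumes haul_dur is in hours.
neus_duration_ok <- function(haul_dur, year, tol_min = 5) {
  target_min <- ifelse(year < 2009, 30, 20)
  abs(haul_dur * 60 - target_min) <= tol_min
}

# Made-up durations (hours) and years
toy_dur  <- c(0.50, 0.30, 0.50, 0.33)
toy_year <- c(2005, 2005, 2012, 2012)

neus_duration_ok(toy_dur, toy_year)
# TRUE FALSE FALSE TRUE
```

This version includes the exact boundaries (for example a tow of exactly 25 minutes), whereas the filter above uses strict inequalities on rounded hour values, so the two can differ for hauls that sit right on a boundary.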

NEUS Fall

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 9, 10, 11
- Inconsistent spatial distribution through time, but should be caught in standardization step
- Seemingly consistent richness (especially after 1984; should be fixed with standardization step)
- Seemingly consistent number of hauls (especially after 1985; should be fixed with standardization step)


#also, for the Northeast US, exclude hauls whose duration is more than 5 minutes off the standard tow: 30 minutes before 2009 (haul_dur between 0.42 and 0.58 hours) and 20 minutes from 2009 onward (haul_dur between 0.25 and 0.42 hours)
neus_fall_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
                                                        (survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year >= 2009 & (haul_dur > 0.25  & haul_dur < 0.42))), haul_id])

####NIGFS Northern Ireland

Spring Northern Ireland (quarter 1)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 2, 3, 4
- Inconsistent spatial distribution through time, but should be caught in standardization step
- Seemingly consistent richness
- Seemingly consistent number of hauls

nigfs_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1" & month %in% c(2,3,4), haul_id])

Fall Northern Ireland (quarter 4)

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Keep months 10, 11
- Consistent spatial distribution through time, but should be caught in standardization step
- Seemingly consistent richness
- Seemingly consistent number of hauls

nigfs_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4" & month %in% c(10,11), haul_id])

####Nor-BTS

The original FishGlob dataset includes Nor-BTS-1 as well, but it was not shared by L. Pecuchet and is therefore not considered here.

Nor-BTS-3

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 8, 9, 10
- Somewhat consistent spatial distribution through time
- Number of hauls is variable, but no clear years to exclude
- Laurene Pecuchet (U Tromso) told us that only surveys from 2004 onwards work for biodiversity analyses

nor_bts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3" & month %in% c(8:10) & year >= 2004, haul_id])

####NS-IBTS

NS-IBTS-1

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 1, 2, 3
- Consistent spatial distribution through time
- Linear increase in richness; the cutoff is clearer from # hauls
- # hauls also increases linearly, but with a somewhat clear break between the late 70s and mid-80s; only keep hauls from 1984 onwards

ns_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1" & month %in% c(1:3) & year >= 1984, haul_id])

NS-IBTS-3

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 7, 8, 9
- Consistent spatial distribution through time
- Consistent richness through time
- Early years have lower # hauls; will start at 1998

ns_ibts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3" & month %in% c(7:9) & year >= 1998, haul_id])

####NZ

NZ-CHAT

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 12, 1, 2 (note that the NZ-CHAT survey crosses the calendar year, so month 12 was already lumped with the next year)
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls after 1995

nz_chat_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT" & month %in% c(12,1,2) & year >= 1995, haul_id])
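
The lumping of December hauls with the following year happens earlier in the script and is not shown in this section. As a rough illustration only, the sketch below shows one way such a reassignment could be done with `data.table::fifelse()`. The `survey_year` column and the toy data are assumptions, not the script's actual implementation.

```r
library(data.table)

# Toy hauls from a survey that runs December through February (made-up values)
hauls <- data.table(
  haul_id = c("h1", "h2", "h3", "h4"),
  year    = c(1999L, 2000L, 2000L, 2000L),  # calendar year of the haul
  month   = c(12L, 1L, 2L, 12L)
)

# Assign December hauls to the following survey year so that one
# December-February season is treated as a single year
hauls[, survey_year := fifelse(month == 12L, year + 1L, year)]
```

With this kind of adjustment in place, a filter such as `year >= 1995` acts on the survey season rather than on the calendar year.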

NZ-ECSI

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 4, 5, 6
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls
- Gap between 1995 and 2005, but we have 10 total years so we’ll keep for now

nz_ecsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI" & month %in% c(4,5,6), haul_id])
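
The decision to keep NZ-ECSI rests on two numbers: how many distinct years were sampled and how long the longest gap is. A minimal sketch of how both could be computed is given below. The vector of years is invented for illustration and is not the actual NZ-ECSI sampling record.

```r
library(data.table)

# Made-up sampling years with a long gap in the middle
years_sampled <- c(1991L, 1992L, 1993L, 1994L, 1995L,
                   2005L, 2006L, 2007L, 2008L, 2009L)

# Number of distinct years sampled
n_years <- uniqueN(years_sampled)

# Longest run between consecutive sampled years, in years
largest_gap <- max(diff(sort(unique(years_sampled))))
```

In the real script the years would come from something like `unique(FishGlob.10year.uniquehauls[haul_id %in% nz_ecsi_keep, year])`.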

NZ-SUBA

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 11 and 12
- Consistent spatial distribution through time
- Seemingly consistent richness
- Far more hauls in the 1990s; these early sampling years will be excluded (start in 2000)

nz_suba_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA" & month %in% c(11,12) & year >= 2000, haul_id])

NZ-WCSI

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 3, 4
- Consistent spatial distribution through time
- Seemingly consistent richness
- Linear decrease in # of hauls through time; leave out the first two years, which have the highest # hauls (keep year >= 1995)

nz_wcsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI" & month %in% c(3,4) & year >= 1995, haul_id])

####PT-IBTS

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 9, 10, 11
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls

pt_ibts_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS" & month %in% c(9,10,11), haul_id])

####ROCKALL

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ROCKALL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ROCKALL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 8, 9
- Consistent spatial distribution through time
- Seemingly consistent richness
- Seemingly consistent number of hauls

rockall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL" & month %in% c(8,9), haul_id])
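
The same set of diagnostic plots is repeated for every survey unit with only the `survey_unit` string changed. As a sketch of how that repetition could be reduced, the hypothetical helper below wraps the month-by-year tile plot in a function. `plot_month_by_year()` and the toy table are not part of the original script, and `scale_fill_viridis_c()` from ggplot2 stands in for `viridis::scale_fill_viridis()` only to keep the example's dependencies small.

```r
library(data.table)
library(ggplot2)

# Hypothetical helper: month-by-year tile plot of haul proportions for one unit
plot_month_by_year <- function(dt, unit) {
  ggplot(dt[survey_unit == unit]) +
    geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),
              color = "white") +
    scale_fill_viridis_c() +
    labs(x = "Year", y = "Month", fill = "Proportion of Annual Hauls") +
    theme_classic()
}

# Toy stand-in for FishGlob.10year.uniquehauls.annual.month (made-up values)
toy <- data.table(
  survey_unit = c("ROCKALL", "ROCKALL", "ROCKALL", "S-GEORG"),
  year        = c(2001L, 2001L, 2002L, 2002L),
  month       = c(8L, 9L, 9L, 1L),
  haul_proportion_month_yearly = c(0.4, 0.6, 1, 1)
)

p <- plot_month_by_year(toy, "ROCKALL")
```

Calling `print(p)` draws the plot; a loop or `lapply()` over `unique(dt$survey_unit)` would produce one plot per survey.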

####S-GEORG

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "S-GEORG",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "S-GEORG",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 1 and 2
- Consistent spatial distribution through time
- Seemingly consistent richness except for 2003, which will be excluded
- Seemingly consistent number of hauls except for 2012, which will be excluded

s_georg_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG" & month %in% c(1,2) & !(year %in% c(2003,2012)), haul_id])
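
The excluded years above were chosen by looking at the bar plots. For readers who want a numeric cross-check, the sketch below flags years whose haul count deviates from the median by more than a chosen fraction. The 50% threshold, the `flag` column, and the counts are all arbitrary assumptions for illustration; this is not the rule used in the script.

```r
library(data.table)

# Made-up annual haul counts with one unusually low year
counts <- data.table(
  year = 2008:2014,
  haulid_count_survey_year = c(70L, 72L, 68L, 71L, 25L, 69L, 73L)
)

# Relative deviation from the median count across years
med <- median(counts$haulid_count_survey_year)
counts[, flag := abs(haulid_count_survey_year - med) / med > 0.5]

# Years a reviewer might want to inspect before excluding
flagged_years <- counts[flag == TRUE, year]
```

A flag like this only points to candidate years; whether to drop them remains a judgment call based on the plots and survey history.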

####SCS

Spring

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 2, 3, 4
- Inconsistent spatial distribution through time (northern latitudes only sampled in early years); only include longitudes < -62 and latitudes < 45.5
- Seemingly consistent richness
- Number of hauls is variable; exclude years with very low or very high numbers (1985, 1994, 2015, 2019)

scs_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING" & month %in% c(2,3,4) & !(year %in% c(1985,1994,2015,2019)) & longitude_adj < -62 & latitude < 45.5, haul_id])
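
To see how much the spatial restriction removes in each year, the hauls falling outside the retained area can be tallied. The sketch below does this on made-up coordinates. The `in_core` column is a hypothetical name, and the thresholds are the ones stated in the note above.

```r
library(data.table)

# Toy hauls (made-up coordinates); early years reach further north and east
hauls <- data.table(
  haul_id       = paste0("s", 1:6),
  year          = c(1980L, 1980L, 1981L, 2005L, 2005L, 2006L),
  longitude_adj = c(-60.5, -63.0, -61.0, -64.2, -65.0, -63.5),
  latitude      = c(46.2, 44.0, 45.9, 43.8, 44.5, 43.0)
)

# TRUE when a haul falls inside the area retained for SCS-SPRING
hauls[, in_core := longitude_adj < -62 & latitude < 45.5]

# Hauls dropped by the spatial filter, per year
outside_by_year <- hauls[, .(n_outside = sum(!in_core), n_total = .N), by = year]
```

If the restriction is doing its job, `n_outside` should be concentrated in the early years and close to zero afterwards.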

Summer

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 6, 7, 8
- Consistent spatial distribution through time
- Richness increases linearly with no clear breakpoint; using the breakpoint from # of hauls, but will exclude 2010, which has very high richness
- # Hauls increases linearly from ~120 in 1970 to ~220 in 2020 with no clear breakpoint, but will go with 1986 because there is a jump between 85 and 86
- Gear change in 1983 (Ellingsen et al. 2015)

scs_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER" & month %in% c(6,7,8) & year >= 1986 & year != 2010, haul_id])
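
The 1986 start year is based on a visible jump in the number of hauls between consecutive years. A simple numeric counterpart is to compute the year-to-year change and find where it is largest, as sketched below. The counts are invented for illustration and the `change` column is a hypothetical name; the real values live in `FishGlob.10year.spp_haulid_year.r`.

```r
library(data.table)

# Made-up annual haul counts with a step between 1985 and 1986
haul_counts <- data.table(
  year = 1982:1988,
  haulid_count_survey_year = c(130L, 128L, 133L, 131L, 165L, 168L, 170L)
)
setorder(haul_counts, year)

# Change relative to the previous year (NA for the first year)
haul_counts[, change := haulid_count_survey_year - shift(haulid_count_survey_year)]

# First year after the largest single-year increase
break_year <- haul_counts[which.max(change), year]
```

This only locates the largest step; it does not test whether the step is meaningful against the underlying linear trend described in the note.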

####SEUS

Spring

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 4, 5, 6
- Consistent spatial distribution through time
- Consistent richness through time
- # Hauls low in 1989 and 2018; will exclude both years

seus_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring" & month %in% c(4,5,6) & year != 1989 & year != 2018, haul_id])

Summer

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 7, 8
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls low in first year (1989), otherwise okay; exclude 1989

seus_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer" & month %in% c(7,8) & year != 1989, haul_id])

Fall

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 9, 10, 11
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls low in first year (1989), otherwise okay; exclude 1989

seus_fall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall" & month %in% c(9,10,11) & year != 1989, haul_id])
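The three SEUS filters above share one pattern: pick a survey unit, keep a set of months, and drop years with too few hauls. A minimal sketch of that pattern as a reusable function, run on toy data, is given here. The helper `select_hauls()` and the toy table are hypothetical and not part of the original script.

```r
library(data.table)

# Hypothetical helper: unique haul IDs for one survey unit,
# restricted to the chosen months and excluding the listed years.
select_hauls <- function(hauls, unit, months, drop_years = integer(0)) {
  unique(hauls[survey_unit == unit & month %in% months & !(year %in% drop_years), haul_id])
}

# Toy stand-in for FishGlob.10year.uniquehauls
toy_hauls <- data.table(
  survey_unit = c("SEUS-fall", "SEUS-fall", "SEUS-fall", "SEUS-summer"),
  year        = c(1989L, 1990L, 1990L, 1990L),
  month       = c(10L, 10L, 12L, 7L),
  haul_id     = c("h1", "h2", "h3", "h4")
)

# Keep fall months 9-11 and drop the low-effort first year
seus_fall_toy <- select_hauls(toy_hauls, "SEUS-fall", months = 9:11, drop_years = 1989)
print(seus_fall_toy)
```

Writing the filter once makes it harder for a year condition copied from one survey to slip unnoticed into another.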

####SWC-IBTS

Scotland Shelf Sea

SWC-IBTS 1

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 1, 2, 3
- Somewhat inconsistent spatial distribution through time, but this should be addressed in the spatial standardization procedure
- Richness consistent through time
- Number of hauls consistent except low in 1995; exclude 1995

swc_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1" & month %in% c(1,2,3) & year != 1995, haul_id])
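Low-haul years such as 1995 were picked out by eye from the bar plots. A minimal sketch of a programmatic cross-check on toy data is given here. The 50%-of-median threshold and the name `flag_low_years()` are assumptions for illustration, not criteria used in the manuscript.

```r
library(data.table)

# Hypothetical helper: years whose haul count falls below a fraction of the median
flag_low_years <- function(hauls, frac = 0.5) {
  counts <- hauls[, .(n_hauls = uniqueN(haul_id)), by = year]
  counts[n_hauls < frac * median(n_hauls), year]
}

# Toy survey: 10, 2, 9 and 11 hauls in four consecutive years
toy_hauls <- data.table(
  year    = c(rep(1994L, 10), rep(1995L, 2), rep(1996L, 9), rep(1997L, 11)),
  haul_id = paste0("h", 1:32)
)

low_years <- flag_low_years(toy_hauls)
print(low_years)
```

A check like this could only complement the visual inspection, since it ignores spatial coverage.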

SWC-IBTS 4

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Use months 10, 11, 12
- Somewhat inconsistent spatial distribution through time (southern latitudes only sampled in early years), but this should be addressed in the spatial standardization procedure
- Richness consistent through time (especially after the mid-1990s)
- Number of hauls consistent except low before 1995 and low in 2013; exclude these

swc_ibts_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4" & month %in% c(10,11,12) & year != 2013 & year >= 1995, haul_id])

####WCANN

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "WCANN",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "WCANN",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "WCANN",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "WCANN",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "WCANN",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "WCANN",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- One exception here: will use four months (6, 7, 8, 9) because all are sampled consistently, and lower latitude areas are consistently sampled later in the summer
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls consistent through time

wcann_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "WCANN" & month %in% c(6:9), haul_id])

####WCTRI

- Exclude because only 10 years and overlaps somewhat with WCANN

wctri_keep <- NULL

####ZAF

ATL

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Include months 1, 2, 3
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls consistent through time from 1991 onward

zaf_atl_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL" & month %in% c(1:3) & year >= 1991, haul_id])

IND

ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()


ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()


# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()


# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

- Include months 4, 5, 6
- Consistent spatial distribution through time
- Richness consistent through time
- Number of hauls consistent through 2001, and then also in 2005 and 2009-2010

zaf_ind_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND" & month %in% c(4:6) & year %in% c(1985:2001,2005, 2009,2010), haul_id])

####Combine all lists that have _keep

#all objects with _keep
list_obj <- ls(pattern = "_keep")

#combine
fishglob_haulids_to_keep <- unlist(lapply(list_obj, get)) #229894 hauls (Started with 278405)

FishGlob.10year.spp_manualclean <- FishGlob.10year.spp[haul_id %in% fishglob_haulids_to_keep,]
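A minimal sketch of the combine step on toy vectors follows. It relies on two base R behaviours: `unlist()` silently drops `NULL` entries, so an excluded survey such as `wctri_keep <- NULL` contributes nothing, and `ls(pattern = "_keep")` matches every object whose name contains `_keep`, which would include `fishglob_haulids_to_keep` itself if the step is rerun in the same session. The toy object names and the separate environment are hypothetical.

```r
# Toy keep-lists in their own environment so nothing else is matched
env <- new.env()
assign("a_keep", c("h1", "h2"), envir = env)
assign("b_keep", NULL, envir = env)   # an excluded survey
assign("c_keep", "h3", envir = env)

# Collect names, guarding against the combined vector from an earlier run
keep_names <- setdiff(ls(envir = env, pattern = "_keep$"), "fishglob_haulids_to_keep")

# mget() fetches all objects at once; unlist() drops the NULL entry
toy_ids <- unlist(mget(keep_names, envir = env), use.names = FALSE)
print(toy_ids)
```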

#Require latitude and longitude for all observations
FishGlob.10year.spp_manualclean <- FishGlob.10year.spp_manualclean[complete.cases(FishGlob.10year.spp_manualclean[,.(latitude, longitude)])] #check that this works
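The "check that this works" note can be addressed with a small toy test. The sketch below uses the same `complete.cases()` idiom on a hypothetical three-row table and compares it with an explicit `is.na()` filter.

```r
library(data.table)

toy <- data.table(
  haul_id   = c("h1", "h2", "h3"),
  latitude  = c(34.1, NA, 33.7),
  longitude = c(-77.2, -78.0, NA)
)

# Same idiom as above: keep rows where both coordinates are present
toy_complete <- toy[complete.cases(toy[, .(latitude, longitude)])]

# Equivalent, more explicit form
toy_explicit <- toy[!is.na(latitude) & !is.na(longitude)]

print(toy_complete)
```

Both forms keep only the haul with a full coordinate pair.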

#another check for # years sampled
#new row for total number of years sampled
FishGlob.10year.spp_manualclean[,years_sampled := length(unique(year)),.(survey_unit)]
View(unique(FishGlob.10year.spp_manualclean[,.(survey_unit, years_sampled)]))
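A minimal sketch of the years-sampled check on toy data follows. It uses `uniqueN()`, data.table's shorthand for `length(unique())`, and `print()` in place of `View()`, which needs an interactive session. The toy table is hypothetical.

```r
library(data.table)

toy <- data.table(
  survey_unit = c("A", "A", "A", "B", "B", "B"),
  year        = c(2001L, 2001L, 2002L, 2010L, 2011L, 2012L)
)

# Number of distinct years per survey unit, added by reference
toy[, years_sampled := uniqueN(year), by = survey_unit]

summary_tbl <- unique(toy[, .(survey_unit, years_sampled)])
print(summary_tbl)
```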

#save
saveRDS(FishGlob.10year.spp_manualclean, file = here::here("data","cleaned","FishGlob.10year.spp_manualclean.rds"))

####Some surveys sample through end of year, fix these

- Note that the NZ-CHAT survey crosses the calendar year, so lump months 1 and 2 with the previous year
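A minimal sketch of how the NZ-CHAT adjustment might look on toy data is given here. The column names follow the rest of the script, but the toy table is hypothetical and this is an assumption about the approach, not the manuscript's actual code.

```r
library(data.table)

toy <- data.table(
  survey_unit = c("NZ-CHAT", "NZ-CHAT", "NZ-CHAT", "WCANN"),
  year        = c(2005L, 2006L, 2006L, 2006L),
  month       = c(12L, 1L, 2L, 1L)
)

# Assign January and February hauls to the survey season that began the previous year
toy[survey_unit == "NZ-CHAT" & month %in% 1:2, year := year - 1L]
print(toy)
```

Other surveys are untouched because the update is restricted to the NZ-CHAT rows.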

---
title: "Prepare FishGlob Dataset"
output: html_notebook
author: Zoë J. Kitchel
date: October 11, 2023
---

Script 1 for Kitchel et al. 2023 in prep taxonomic diversity manuscript.


```{r setup}
library(tidyverse)
library(sp)
library(raster)
#library(rgeos)
library(rgbif)
library(viridis)
library(gridExtra)
library(rasterVis)
library(concaveman)
library(sf)
library(cowplot)
library(data.table)
set.seed(1)

```

Pull in compiled and cleaned data from FishGlob downloaded on November 28 2022 (V 1.5). This is typically compiled by Dr. Aurore Maureaud. It includes both public and private data, so the link cannot be shared. However, with some editing you can run the analyses for the public trawl surveys.

|Survey code|Survey name short|Survey name long|Agency|Region|Access|Provider/link to access|Inclusion|
|-----------|-----------|----------|-----------|-----------|----------|----------|----------|
|AI  |Aleutian Islands|Aleutian Islands|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|BITS-1  |Baltic Sea Q1|Baltic Sea Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|BITS-4  |Baltic Sea Q4|Baltic Sea Quarter 4|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|CHL  |Chile|Chile|Universidad de Concepción, Chile|South America|Requires data request|Daniela Yepson daniela.yepsen@gmail.com and Luis Cubillos lucubillos@gmail.com|Included|
|COL| Colombia| Colombian Caribbean|Universidad Nacional de Colombia|South America|Requires data request|Camilo B. Garcia cbgarciar@unal.edu.co|Too few years|
|DFO-HS  |Hecate Strait|Hecate Strait|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/780a1c02-1f9c-4994-bc70-a0e9ef8e3968 and OceanAdapt: https://zenodo.org/records/8103080|Too few years|
|DFO-NF  |Newfoundland|Newfoundland|Department of Fisheries and Oceans| Canada|Requires data request|Mariano Koen-Alonso mariano.koen-alonso@dfo-mpo.gc.ca|Included|
|DFO-QCS  |Queen Charlotte Sound|Queen Charlotte Sound|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/a278d1af-d567-4964-a109-ae1e84cbd24a and OceanAdapt: https://zenodo.org/records/8103080|Included|
|DFO-SOG  |Strait of Georgia|Strait of Georgia|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/d880ba18-8790-41a2-bf73-e9247380759b and OceanAdapt: https://zenodo.org/records/8103080| Too few years|
|DFO-WCHG  |West Coast Haida Gwaii|West Coast Haida Gwaii|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/5ee30758-b1d6-49fe-8c4e-5136f4b39ad1 and OceanAdapt: https://zenodo.org/records/8103080| Too few years|
|DFO-WCVI  |West Coast Vancouver Island|West Coast Vancouver Island|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/557e42ae-06fe-426d-8242-c3107670b1de and OceanAdapt: https://zenodo.org/records/8103080| Too few years|
|EBS  |Eastern Bering Sea|Eastern Bering Sea|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|EVHOE  |Bay of Biscay|Bay of Biscay|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|FALK  |Falkland Islands|Falkland Islands|Falkland Islands Fisheries Department|Southern Ocean|Requires data request|Alexander Arkhipkin aarkhipkin@fisheries.gov.fk and Jorge Ramos jeramos@fisheries.gov.fk| Excluded after spatial temporal standardization in next script|
|FR-CGFS  |English Channel|English Channel|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|GIN  |Guinea|Guinea|National Center of Fisheries Sciences of Boussoura, Conakry, Republic of Guinea|Africa|Requires data request|Mohammed Lamine Camara mlcamara.kennedy@gmail.com|Inconsistent sampling through space and time|
|GMEX-Summer  |Gulf of Mexico Summer|Gulf of Mexico Summer|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|GMEX-Fall  |Gulf of Mexico Fall|Gulf of Mexico Fall|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|GOA  |Gulf of Alaska|Gulf of Alaska|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|GRL-DE  |Greenland|Greenland|Thuenen Institute of Sea Fisheries|Europe|Requires data request|Karl-Michael Werner karl-michael.werner@thuenen.de|Included|
|GSL-N  |N Gulf of St. Lawrence|Northern Gulf of St. Lawrence|Department of Fisheries and Oceans|Canada|Public|See OceanAdapt: https://zenodo.org/records/8103080 for specific DFO links|Included|
|GSL-S  |S Gulf of St. Lawrence|Southern Gulf of St. Lawrence|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/1989de32-bc5d-c696-879c-54d422438e64 and OceanAdapt: https://zenodo.org/records/8103080|Included|
|ICE-GFS  |Iceland|Iceland|Marine and Freshwater Research Institute, Iceland|Europe|Requires data request|Jón Sólmundsson jon.solmundsson@hafogvatn.is|Included|
|IE-IGFS  |Irish Sea|Irish Sea|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|IS-TAU| Israel | Israel| Tel Aviv University|Asia|Requires data request| Jonathan Belmaker jonathan.belmaker@gmail.com|Too few years|
|IS-MOAG|Israel|Israel|Israeli Ministry of Agriculture|Asia|Requires data request|Oren Sonin orens@moag.gov.il and Dori Edelist blackreefs@gmail.com|Inconsistent sampling through space and time|
|MEDITS  |Mediterranean|Mediterranean|Multiple|Europe|Requires data request|Contact corresponding author for contacts|Included|
|MRT|Mauritania|Mauritania|Institut Mauritanien de Recherches Océanographiques et des Pêches, Nouadhibou, Mauritania|Africa|Requires data request|Beyah Meissa bmouldhabib@gmail.com|Inconsistent sampling through space and time|
|NAM  |Namibia|Namibia|National Marine Information and Research Centre, Ministry of Fisheries and Marine Resources, Namibia|Africa|Requires data request|Johannes Kathena john.kathena@mfmr.gov.na|Included|
|NEUS-Fall  |NE US Fall|Northeast USA Fall|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|NEUS-Spring  |NE US Spring|Northeast USA Spring|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|NIGFS-1  |N Ireland Q1|North Ireland Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NIGFS-4  |N Ireland Q4|North Ireland Quarter 4|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|Nor-BTS-3  |Barents Sea Norway Q3|Barents Sea Norway Q3|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NS-IBTS-1  |N Sea Q1|North Sea Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NS-IBTS-3  |N Sea Q3|North Sea Quarter 3|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|NZ-CHAT  |Chatham Rise NZ|Chatham Rise New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|NZ-ECSI  |E Coast S Island NZ|East Coast South Island New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|NZ-SUBA  |Sub-Antarctic NZ|Sub-Antarctic New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|NZ-WCSI  |W Coast S Island NZ|West Coast South Island New Zealand|National Institute of Water and Atmospheric Research Limited, New Zealand| Oceania| Requires data request|Richard O'Driscoll richard.odriscoll@niwa.co.nz and Fabrice Stephenson fabrice.stephenson@waikato.ac.nz|Included|
|PT-IBTS  |Portugal|Portugal|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|ROCKALL  |Rockall Plateau|Rockall Plateau|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|S-GEORG  |S Georgia|South Georgia|British Antarctic Survey|Southern Ocean|Requires data request|Mark Belchier mark.belchier@gov.gs and Martin Collins macol@bas.ac.uk|Included|
|SCS-Fall  |Scotian Shelf Fall|Scotian Shelf Fall|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080|Too few years|Included|
|SCS-SPRING  |Scotian Shelf Spring|Scotian Shelf Spring|Department of Fisheries and Oceans| Canada|Public|https://open.canada.ca/data/en/dataset/fecf045a-95a2-4b69-8a40-818649a62716 and OceanAdapt: https://zenodo.org/records/8103080|Too much data loss after spatial temporal standardization|
|SCS-SUMMER  |Scotian Shelf Summer|Scotian Shelf Summer|Department of Fisheries and Oceans|Canada|Public|https://open.canada.ca/data/en/dataset/1366e1f1-e2c8-4905-89ae-e10f1be0a164 and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SEUS-fall  |SE US Fall|Southeast USA Fall|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SEUS-spring  |SE US Spring|Southeast USA Spring|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SEUS-summer  |SE US Summer|Southeast USA Summer|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|SWC-IBTS-1  |Scotland Shelf Sea Q1|Scotland Shelf Sea Quarter 1|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|SWC-IBTS-4  |Scotland Shelf Sea Q4|Scotland Shelf Sea Quarter 4|International Council for the Exploration of the Sea|Europe|Public|https://datras.ices.dk/Data_products/Download/Download_Data_public.aspx|Included|
|WBLS| Western Black Sea| Western Black Sea|Institute of Fish Resources, Bulgaria|Europe|Requires data request|Elitsa Petrova (elitssa@yahoo.com), Feriha Tserkova & Vesselina Mihneva| Too few years|
|WCANN  |W Coast US|West Coast USA|National Oceanic and Atmospheric Administration| USA|Public|DisMAP: https://apps-st.fisheries.noaa.gov/dismap/ and OceanAdapt: https://zenodo.org/records/8103080|Included|
|ZAF-ATL  |Atlantic Ocean ZA|Atlantic Ocean South Africa|Department of Forestry, Fisheries and the Environment, South Africa|Africa|Requires data request| Tracey Fairweather traceyf@daff.gov.za|Included|
|ZAF-IND  |Indian Ocean ZA|Indian Ocean South Africa|Department of Forestry, Fisheries and the Environment, South Africa|Africa|Requires data request| Tracey Fairweather traceyf@daff.gov.za|Included|




```{r pull in fishglob database}

FishGlob_1.5 <- fread(here::here("data","FISHGLOB_v1.5_clean.csv"))

```

This version of FishGlob leaves out seasons for GMEX; fix here.

```{r add season to GMEX}
#add season to GMEX to survey unit

FishGlob_1.5[survey == "GMEX", survey_unit := paste0(survey,"-",season)]
```

Also adding in seasons for NIGFS

```{r add season to NIGFS}
#add quarter to NIGFS survey unit

FishGlob_1.5[survey == "NIGFS", survey_unit := paste0(survey,"-",quarter)]
```
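The two relabelling steps above build survey units with `paste0()`. One thing worth checking is that `paste0()` turns a missing season or quarter into the literal text `NA`. A minimal sketch on a hypothetical toy table, including that check:

```r
library(data.table)

toy <- data.table(
  survey      = c("GMEX", "GMEX", "NIGFS", "EBS"),
  survey_unit = c("GMEX", "GMEX", "NIGFS", "EBS"),
  season      = c("Summer", NA, NA, NA),
  quarter     = c(3L, 4L, 1L, 3L)
)

# Same updates by reference as in the chunks above
toy[survey == "GMEX",  survey_unit := paste0(survey, "-", season)]
toy[survey == "NIGFS", survey_unit := paste0(survey, "-", quarter)]

# Flag any survey unit that picked up a missing season or quarter
bad_units <- toy[grepl("-NA$", survey_unit), unique(survey_unit)]
print(bad_units)
```

An empty `bad_units` on the real data would confirm that every GMEX and NIGFS haul has a season or quarter recorded.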

ZAF (South Africa) has distinct Atlantic and Indian surveys (split at ~20.01˚ E, Cape Agulhas).

```{r add longitudinal region to ZAF}
FishGlob_1.5[survey == "ZAF" & longitude <20.01, survey_unit := "ZAF-ATL"][survey == "ZAF" & longitude >= 20.01, survey_unit := "ZAF-IND"]
```
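The same split can be sketched in a single pass with `fifelse()` on toy data. The table and the `boundary` variable are hypothetical. Unlike the chained version above, this form would set `survey_unit` to `NA` for any ZAF haul with a missing longitude, so it is an alternative to weigh rather than a drop-in replacement.

```r
library(data.table)

toy <- data.table(
  survey      = c("ZAF", "ZAF", "ZAF", "NAM"),
  survey_unit = c("ZAF", "ZAF", "ZAF", "NAM"),
  longitude   = c(17.5, 20.01, 24.3, 14.0)
)

# West of the boundary is Atlantic; at or east of it is Indian
boundary <- 20.01
toy[survey == "ZAF", survey_unit := fifelse(longitude < boundary, "ZAF-ATL", "ZAF-IND")]
print(toy)
```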

Region names
```{r}
sort(unique(FishGlob_1.5[,survey_unit]))
```
##Data Replacements
####Greenland (version in FishGlob 1.5 is missing lengths and therefore biomass values)
This version was obtained directly from Karl-Michael Werner ([karl-michael.werner@thuenen.de](mailto:karl-michael.werner@thuenen.de)), who now manages the Greenland survey (as of September 2023). He is based in Germany.

```{r}
#greenland <- 

```

####Norway
Prepped by Laurene Pecuchet (U Tromsø, Norway) in September 2023 to replace what's in FishGlob 1.5 because IMR "are quite concerned that FishGlob, and other studies, have been using a "flawed" multi-surveys dataset that is available in NMDC (data portal of IMR). Turns out that this dataset was put publicly by miscommunication on NMDC after one published paper in Scientific Reports, and I think they only realized the existence of this dataset just the last year as some papers are coming out using it (especially the one from Cesc Gordo-Vilaseca in PNAS https://www.pnas.org/doi/10.1073/pnas.2120869120). They are now trying to make some damage controls to make sure that this dataset is not used ever again in the future, but that cleaned and standardised datasets of the Barents Sea survey that are publicly available in NMDC are used instead."

September 14: From Laurene, "I send you in attachment the “new” IMR survey formatted for Fishglob. I have done some small check of the dataset, and so far everything looks good, but I didn’t do a deep check yet, but I don’t see why there should be any problems with it....For your study, I think it is also important that you know that there has been some inconsistencies in taxonomic descriptions in the Barents Sea so that some species should be considered at the genus level instead of for biodiversity analysis, I send you in attach an excel (Barents Sea Fish Reference List.csv) file that summarize which species might be a misidentification and which one should be considered and merged." All of these files now live in "data/Norway_Sep2023"

Helpful guidance from here: https://www.hi.no/en/hi/nettrapporter/rapport-fra-havforskningen-en-2021-15
- "2.2.5 - Recommended adjustments to the output before analysis
Eelpouts and liparids. When combing years, we recommend that all records of eelpouts (Zoarcidae) are pooled to the family level, because they are notoriously difficult to identify (see Appendix 3). The same apply to liparids (Liparidae). If species level data of these families are used, consider excluding data from 2004-2006/2007. These years the staff on some of the Norwegian vessels were inexperienced, and proper identification keys for arctic species were lacking (compare for instance catches of Lycodes frigidus and Lycodes eudipleurostictus in the first years to the later years, Appendix 3). If species level data of these families are used, records to family levels should be removed or else these will be treated as a separate species in the further analysis of the data. Both Zoarcidae and Liparidae have unresolved taxonomy for some genera, therefore we have chosen to pool all liparids of the genus Careproctus and all eelpouts of the genus Gymnelus in the output. Sebastes. The column "Sebastes spp." contains mainly juvenile redfish. Small specimens are very difficult to identify so the protocol is to identify only individuals larger than 10 cm to the species level. Before analysis, all redfish (S. mentella, S. norvegicus, S. viviparus and Sebastes spp.) should be pooled, or Sebastes spp. should be removed – if not it will be treated as a separate species in the analysis. Records in Appendix 2. The records of the S. viviparus west of Svalbard (Spitsbergen) are unreliable and should be removed if Sebastes data are kept at the species level (Appendix 2). Species verified for the Barents Sea, but outliers in terms the normal depth range, distribution area within the Barents Sea, size etc. were coded as questionable in the data base (Appendix 2) and should be removed before analysis. Consider also removing pelagic species (e.g. capelin and herring), as these are poorly sampled by the bottom trawl.
The data should be standardised with towing distance before analysis."

Therefore, we will:
- Remove all records of eelpouts and liparids (Family = Zoarcidae or Liparidae) (as we only include species ID'd to species)
- Remove redfish (Genus = Sebastes)

```{r norway data}

#load Norwegian data
load(here::here("data","Norway_Sep2023","NOR-BTS_clean.RData"))
norway_clean <- data.table(data)

#remove observations without dates
norway_clean <- norway_clean[complete.cases(norway_clean[,.(month)]),]

#remove species records in accordance with recommendation from HI
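#use %in% for genus so records with a missing genus are not silently dropped (== returns NA, which data.table treats as FALSE)
norway_clean <- norway_clean[!(family %in% c("Zoarcidae","Liparidae") | genus %in% "Sebastes"),]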

#some column names don't match FishGlob (FishGlob = num, num_h, num_cpue, wgt, wgt_h, wgt_cpue; Norway = num, num_cpue (number of ind./hour), num_cpua (number of ind./km2), wgt, wgt_cpue (kg/min), wgt_cpua (kg/km2))
#also, some column units in the readme are incorrect, so I regenerate the _h (per hour) and _cpue (per unit area swept) values here
#rename haul_dur first so the original minutes are kept alongside the new hours column
setnames(norway_clean, c("haul_dur"), c("haul_dur_m"))
norway_clean[,haul_dur := haul_dur_m/60] #haul duration currently in minutes, need hours
norway_clean[,num_h := num/haul_dur][,num_cpue := num/area_swept][,wgt_h := wgt/haul_dur][,wgt_cpue := wgt/area_swept]

#change some columns to numeric
cols = c("month","day")
norway_clean[,(cols) := lapply(.SD,as.numeric),.SDcols = cols]

#keep only the FishGlob columns, in FishGlob order (this drops the extra Norwegian columns, e.g. source and timestamp)
fishglob_colnames <- colnames(FishGlob_1.5)
norway_clean <- norway_clean[,..fishglob_colnames]

norway_clean[survey == "Nor-BTS" & month %in% c(1:6), survey_unit := "Nor-BTS-1"][survey == "Nor-BTS" & month %in% c(7:12), survey_unit := "Nor-BTS-3"]

#IBTS and Nor-BTS surveys overlap south of 62°N, so delete all Nor-BTS hauls south of 62°N
norway_clean <- norway_clean[latitude  >= 62,]

```
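
A quick check of how the taxonomic exclusion behaves when a genus is missing may be useful here. This is only a sketch on a made-up table (`toy_catch` and its values are hypothetical, not Norwegian data). It relies on the fact that data.table treats `NA` in a logical `i` as `FALSE`, so a `genus == "Sebastes"` test drops rows with a missing genus, while `genus %in% "Sebastes"` keeps them.

```{r}
library(data.table)

# hypothetical toy records (not from the Norwegian data)
toy_catch <- data.table(
  family = c("Zoarcidae", "Sebastidae", "Gadidae", "Gadidae"),
  genus  = c("Lycodes", "Sebastes", "Gadus", NA)
)

# `==` returns NA for the missing genus; data.table treats NA in i as FALSE,
# so the family-level Gadidae record is dropped along with the targeted taxa
toy_eq <- toy_catch[!(family %in% c("Zoarcidae", "Liparidae") | genus == "Sebastes")]

# `%in%` never returns NA, so only the targeted taxa are removed
toy_in <- toy_catch[!(family %in% c("Zoarcidae", "Liparidae") | genus %in% "Sebastes")]

nrow(toy_eq)
nrow(toy_in)
```

Because records not resolved to species are excluded later on, the difference probably does not change the final dataset, but it does affect any row counts reported at this stage.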


Delete the existing Norway records (Greenland removal is commented out for now)
```{r}
FishGlob_1.5 <- FishGlob_1.5[!(survey %in% c("Nor-BTS"
                                             #,
                                             #"GRL-DE" #ignore greenland for now...
                                             ))]
```


Add in the updated Norway data (the Greenland update is commented out for now)
```{r}
FishGlob_1.5 <-rbind(FishGlob_1.5,norway_clean)
#FishGlob_1.5 <-rbind(FishGlob_1.5,greenland)
```


##Preliminary Data Cuts
###Specific Regional Changes Before Cutting to 10 years only

*GSL*
- North: we have data 1980-2019, but gear changes in 2004/2005, so let's use later portion (more consistent months of sampling; 2005-2019; 15 years) 
- South: we have data 1970-2019, but gear/vessel changes in 1985 and again in 1992, so again let's use later portion (1992-2019; 27 years)
- See [this github issue](https://github.com/AquaAuma/fishglob/issues/72)

```{r GSL fixes}
#identify haul_ids of hauls we should remove from GSL surveys
haul_ids_to_remove_GSL <- unique(FishGlob_1.5[(survey == "GSL-N" & year < 2005)|(survey == "GSL-S" & year < 1992),haul_id])

FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_GSL),] #remove hauls before consistent gear/vessel was used
```

*SGEORG*
- From Martin Collins, "Most surveys were focused on demersal fish on the South Georgia shelf (< 350 m), but surveys in 2003, 2010 and 2019 had some deeper trawls.  The deeper trawls caught very different fish, so are unlikely to be of use to a long-term analysis, but I have left them in."

- Delete all trawls deeper than 350 m
```{r}
#identify haul_ids of hauls we should remove from the SGEORG survey
haul_ids_to_remove_SGEORG <- unique(FishGlob_1.5[(survey == "SGEORG" & depth >350),haul_id])

FishGlob_1.5 <- FishGlob_1.5[!(haul_id %in% haul_ids_to_remove_SGEORG),] #remove hauls deeper than 350 m
```

*NZ-CHAT*
- Bump December observations to the next year because each survey runs across months 12, 1, and 2
```{r}
#bump December observations forward one year
FishGlob_1.5[survey == "NZ-CHAT" & month == 12, year := year + 1]
```
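
To see why the shift matters, here is a minimal sketch on made-up hauls (`toy_chat` is hypothetical): a single austral-summer survey that runs December to February is split across two calendar years until the December hauls are moved forward.

```{r}
library(data.table)

# hypothetical hauls from one survey spanning December to February
toy_chat <- data.table(
  survey = "NZ-CHAT",
  year   = c(2010, 2011, 2011),
  month  = c(12, 1, 2)
)

# before the shift the single survey is split across two calendar years
years_before <- uniqueN(toy_chat$year)

# move December hauls into the following year so the survey shares one year label
toy_chat[survey == "NZ-CHAT" & month == 12, year := year + 1]

years_after <- uniqueN(toy_chat$year)
c(before = years_before, after = years_after)
```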


###Because time is an essential component of these analyses, we will get rid of any survey x season combinations that are not sampled for at least 10 years

```{r summary by survey region}
#new row for total number of years sampled
FishGlob_1.5[,years_sampled := length(unique(year)),.(survey_unit)]

summary(FishGlob_1.5$years_sampled) #ranges from 2 (DFO Strait of Georgia) to 57 (Northeast US)
View(unique(FishGlob_1.5[,.(survey_unit, years_sampled)]))

#statistics about full dataset
nrow(FishGlob_1.5) 
length(unique(FishGlob_1.5[,survey])) 
length(unique(FishGlob_1.5[,survey_unit])) 

#remove observations for any survey x season combination sampled in fewer than 10 years
FishGlob.10year <- FishGlob_1.5[years_sampled >= 10,]

#statistics about reduced 10 year dataset
nrow(FishGlob.10year) 
length(unique(FishGlob.10year[,survey])) 
length(unique(FishGlob.10year[,as.character(survey_unit)])) 

#remove full database
rm(FishGlob_1.5)


```
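
The 10-year cut can be illustrated on a toy table (`toy_obs` and its two survey units are hypothetical). The sketch counts distinct years per survey unit, so repeated rows within a year count once, and then keeps units with at least 10 sampled years.

```{r}
library(data.table)

# hypothetical survey units: "A" sampled in 12 distinct years, "B" in 4
toy_obs <- data.table(
  survey_unit = c(rep("A", 24), rep("B", 8)),
  year        = c(rep(2001:2012, each = 2), rep(2001:2004, each = 2))
)

# count distinct years per survey unit
toy_obs[, years_sampled := uniqueN(year), by = survey_unit]

# keep only units with at least 10 sampled years
toy_10year <- toy_obs[years_sampled >= 10]

unique(toy_10year[, .(survey_unit, years_sampled)])
```

Note that this count is taken before any region-specific trimming of months or years, so a survey unit close to the threshold could fall below 10 years after later cuts and may be worth re-checking.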

###For taxonomic analyses, resolution to species is required. Therefore, we will  exclude any observations not resolved to species. 

```{r spp ID only}
#convert month to numeric
FishGlob.10year[,month := as.numeric(month)]

FishGlob.10year.spp <- FishGlob.10year[rank %in% c("Species", "Subspecies"),] #3869384 total observations

#remove the 10-year database that still includes records not resolved to species
rm(FishGlob.10year)

#vector with all survey names
all_survey_units <- sort(unique(FishGlob.10year.spp[,survey_unit]))

#calculate # species per year
FishGlob.10year.spp_survey_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, accepted_name)])

FishGlob.10year.spp_survey_year[,spp_count_survey_year := uniqueN(accepted_name),.(survey_unit, year)]

FishGlob.10year.spp_survey_year.r <-unique(FishGlob.10year.spp_survey_year[,.(survey_unit,  year, spp_count_survey_year)])

nrow(FishGlob.10year.spp_survey_year.r)

#calculate # hauls per year
FishGlob.10year.spp_haulid_year <- unique(FishGlob.10year.spp[,.(survey_unit, year, haul_id)])

FishGlob.10year.spp_haulid_year[,haulid_count_survey_year := uniqueN(haul_id),.(survey_unit, year)]

FishGlob.10year.spp_haulid_year.r <-unique(FishGlob.10year.spp_haulid_year[,.(survey_unit,  year, haulid_count_survey_year)])

nrow(FishGlob.10year.spp_haulid_year.r)

```
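
The two per-year summaries (species richness and haul count) can also be produced in a single grouped step. This is a sketch on hypothetical records (`toy_spp`), offered as a compact cross-check rather than a replacement for the tables built in the chunk before it.

```{r}
library(data.table)

# hypothetical species-level records
toy_spp <- data.table(
  survey_unit   = c("A", "A", "A", "A", "B", "B"),
  year          = c(2001, 2001, 2001, 2002, 2001, 2001),
  haul_id       = c("h1", "h1", "h2", "h3", "h4", "h4"),
  accepted_name = c("Gadus morhua", "Clupea harengus", "Gadus morhua",
                    "Gadus morhua", "Gadus morhua", "Sebastes mentella")
)

# one row per survey unit and year, with species richness and haul count
toy_summary <- toy_spp[, .(spp_count  = uniqueN(accepted_name),
                           haul_count = uniqueN(haul_id)),
                       by = .(survey_unit, year)]
toy_summary
```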


##Visually Inspect Distribution of Data Through Time and Space

##Spatial and Temporal Patterns in All Trawl Surveys

Let's look at the number of hauls per year/month and year/quarter and year/season visually

```{r hauls per year, month, quarter}
#unique survey, survey_unit, year, month, quarter, season, haul_id, lat, lon
FishGlob.10year.uniquehauls <- unique(FishGlob.10year.spp[,.(survey, survey_unit, year,month,quarter,season,haul_id, latitude, longitude,haul_dur)])

#add column with adjusted longitude for few surveys that cross dateline (NZ-CHAT and AI)
FishGlob.10year.uniquehauls[,longitude_adj := ifelse((survey_unit %in% c("AI","NZ-CHAT") & longitude > 0),longitude-360,longitude)]

FishGlob.10year.uniquehauls[,haul_counts_per_survey_season_month :=uniqueN(haul_id),.(survey, month, season)][, #count # hauls per survey, season, and month
                     haul_counts_per_survey_quarter_month :=uniqueN(haul_id),.(survey, month, quarter)][,#count # hauls per survey, month, and quarter
                     total_hauls_survey :=uniqueN(haul_id),.(survey)][,#count # hauls per survey in all years
                                                        
              #proportion of hauls for each survey, season, and month divided by total # over all years
                     haul_proportion_survey_season :=haul_counts_per_survey_season_month/total_hauls_survey][,
              #proportion of hauls for each survey, quarter, and month divided by total # over all years
                     haul_proportion_survey_quarter :=haul_counts_per_survey_quarter_month/total_hauls_survey][,
                                                                                                               
                     haul_count_per_survey_year_month :=uniqueN(haul_id),.(year, survey_unit, month)][, #count # hauls per survey unit, year, and month
                     total_hauls_survey_year := uniqueN(haul_id),.(survey_unit,year)][, #count total # hauls per survey unit and year
                     #proportion of hauls for each survey unit and month divided by total # hauls within a survey unit within a year
                     haul_proportion_month_yearly := haul_count_per_survey_year_month/total_hauls_survey_year][, 

                     haul_count_per_survey_year_quarter :=uniqueN(haul_id),.(year, survey_unit, quarter)][, #count # hauls per survey unit, year, and quarter
                     #proportion of hauls for each survey unit and quarter divided by total # hauls within a survey unit within a year
                     haul_proportion_quarter_yearly := haul_count_per_survey_year_quarter/total_hauls_survey_year] 

FishGlob.10year.uniquehauls.season <- unique(FishGlob.10year.uniquehauls[,.(survey, survey_unit, month, season, haul_counts_per_survey_season_month,total_hauls_survey, haul_proportion_survey_season)]) #relative sampling by season across all years

FishGlob.10year.uniquehauls.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey,survey_unit , month, quarter, haul_counts_per_survey_quarter_month,total_hauls_survey, haul_proportion_survey_quarter)]) #relative sampling by quarter across all years

FishGlob.10year.uniquehauls.annual.month <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, month, haul_count_per_survey_year_month,total_hauls_survey_year,haul_proportion_month_yearly)]) #relative sampling by month within years

FishGlob.10year.uniquehauls.annual.quarter <- unique(FishGlob.10year.uniquehauls[,.(survey, year, survey_unit, quarter, haul_count_per_survey_year_quarter,total_hauls_survey_year,haul_proportion_quarter_yearly)]) #relative sampling by quarter within years

#how does #hauls vary with season and month?
survey_season_month_hauls <- ggplot(FishGlob.10year.uniquehauls.season) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  facet_wrap(~survey,scales = "free_y") +
  theme_classic()

ggsave(survey_season_month_hauls, filename = "survey_season_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")

#how does #hauls vary with quarter and month?
survey_quarter_month_hauls <- ggplot(FishGlob.10year.uniquehauls.quarter) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  facet_wrap(~survey,scales = "free_y") +
  theme_classic()

ggsave(survey_quarter_month_hauls, filename = "survey_quarter_month_hauls.pdf",path = here::here("figures","view_data"), height = 5, width = 15, units = "in")

#how does #hauls vary with year and month?
year_survey_month_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.month) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  facet_wrap(~survey_unit,scales = "free_y") +
  theme_classic()

ggsave(year_survey_month_hauls, filename = "year_survey_month_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")

#how does #hauls vary with year and quarter?
year_survey_quarter_hauls <- ggplot(FishGlob.10year.uniquehauls.annual.quarter) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  facet_wrap(~survey_unit,scales = "free_y") +
  theme_classic()

ggsave(year_survey_quarter_hauls, filename = "year_survey_quarter_hauls.pdf",path = here::here("figures","view_data"), height = 8, width = 16, units = "in")
```
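
The within-year proportions plotted in these heatmaps can be checked on a small example. The sketch below uses hypothetical hauls (`toy_hauls`) and shorter column names than the real tables. It divides the haul count per unit, year, and month by the haul count per unit and year, then confirms that the proportions sum to 1 within each unit-year.

```{r}
library(data.table)

# hypothetical hauls for one survey unit
toy_hauls <- data.table(
  survey_unit = "A",
  year        = c(2001, 2001, 2001, 2001, 2002, 2002),
  month       = c(6, 6, 6, 7, 6, 7),
  haul_id     = paste0("h", 1:6)
)

# hauls per unit-year-month divided by hauls per unit-year
toy_hauls[, n_month := uniqueN(haul_id), by = .(survey_unit, year, month)]
toy_hauls[, n_year := uniqueN(haul_id), by = .(survey_unit, year)]
toy_hauls[, prop_month := n_month / n_year]

toy_prop <- unique(toy_hauls[, .(survey_unit, year, month, prop_month)])

# proportions within each unit-year should sum to 1
toy_check <- toy_prop[, .(total = sum(prop_month)), by = .(survey_unit, year)]
toy_check
```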

Now, let's look at how location of sampling varies by month of sampling and year of sampling 

```{r location by year plots}
location_by_year <- ggplot(FishGlob.10year.uniquehauls) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  facet_wrap(~survey_unit, scales = "free") +
  theme_classic()

ggsave(location_by_year, filename = "location_by_year.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_year, filename = "location_by_year.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_year, filename = "location_by_year.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
```


```{r location by month plots}
(location_by_month <- ggplot(FishGlob.10year.uniquehauls) +
  geom_point(aes(x = longitude_adj, y = latitude, color = as.numeric(month)), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  facet_wrap(~survey_unit, scales = "free") +
  theme_classic())

ggsave(location_by_month, filename = "location_by_month.pdf",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_month, filename = "location_by_month.jpg",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")

ggsave(location_by_month, filename = "location_by_month.eps",path = here::here("figures","view_data"), height = 8, width = 12, units = "in")
```
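
These maps use `longitude_adj` rather than raw longitude because a few surveys cross the dateline. A minimal sketch with made-up coordinates (`toy_lon` is hypothetical) shows the effect: raw longitudes make neighbouring hauls on either side of 180° appear almost a full circle apart, while shifting the positive values by 360° restores a contiguous extent.

```{r}
library(data.table)

# hypothetical hauls on either side of the antimeridian (180 degrees)
toy_lon <- data.table(
  survey_unit = c("AI", "AI", "AI", "EBS"),
  longitude   = c(-179.5, 179.5, -170, -165)
)

# with raw longitudes the AI hauls appear to span 359 degrees
span_raw <- toy_lon[survey_unit == "AI", diff(range(longitude))]

# shift positive longitudes west by 360 degrees for dateline-crossing units only
toy_lon[, longitude_adj := ifelse(survey_unit %in% c("AI", "NZ-CHAT") & longitude > 0,
                                  longitude - 360, longitude)]

span_adj <- toy_lon[survey_unit == "AI", diff(range(longitude_adj))]
c(raw = span_raw, adjusted = span_adj)
```

Units that do not cross the dateline are left unchanged, so the adjusted column can be used for every facet.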


##Region Specific Data Processing

- Fredston et al. 2023 Nature and Batt et al. 2017 Ecology Letters informed North American data processing
- Personal communication with Aurore Maureaud and Laurene Pecuchet re: work by A. Maureaud, L. Pecuchet and R. Frelat and the supplementary material for Maureaud et al. 2019 Proceedings of the Royal Society B: Biological Sciences informed European data processing
- Additional data processing informed by data itself, and by FishGlob pdf summary documents
- Limit to max 3 months for each survey unit, representative of a 'season' (exception = West Coast USA where all 4 months sampled consistently)
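
The month windows below were chosen by looking at the plots. As a numeric complement, one could tabulate the share of hauls retained by a candidate window. This is only a sketch under that assumption, on hypothetical hauls (`toy_months`); it is not the procedure used to choose the months in this script.

```{r}
library(data.table)

# hypothetical hauls: most effort in June-August, single hauls in May and September
toy_months <- data.table(
  survey_unit = "A",
  haul_id     = paste0("h", 1:20),
  month       = c(rep(6, 6), rep(7, 8), rep(8, 4), 5, 9)
)

# share of each unit's hauls that falls in each month
toy_share <- toy_months[, .(n_hauls = uniqueN(haul_id)), by = .(survey_unit, month)]
toy_share[, share := n_hauls / sum(n_hauls), by = survey_unit]
setorder(toy_share, survey_unit, month)

# share of hauls retained by a candidate three-month window (here months 6-8)
toy_kept <- toy_share[month %in% 6:8, .(share_kept = sum(share)), by = survey_unit]
toy_kept
```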

####AI
```{r AI visual}
ggplot(FishGlob.10year.uniquehauls.season[survey == "AI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey == "AI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey == "AI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey == "AI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey == "AI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "AI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "AI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Most hauls in 6,7,8
- Seemingly consistent spatial distribution through time
- No dramatic changes in spp richness 
```{r AI processing}
ai_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "AI" & month %in% c(6:8),haul_id])
```
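
Each region's processing chunk follows the same pattern: pick one survey unit, keep a set of months, and optionally restrict years. A small helper could express that pattern once. `select_hauls()` and `toy_all` below are hypothetical and not part of the original script; the argument names were chosen so they do not clash with column names in the haul table.

```{r}
library(data.table)

# hypothetical helper: haul_ids for one survey unit restricted to a set of
# months and, optionally, a minimum year and a set of years to drop
select_hauls <- function(hauls, unit, months, min_year = -Inf, drop_years = integer(0)) {
  unique(hauls[survey_unit == unit & month %in% months &
                 year >= min_year & !(year %in% drop_years), haul_id])
}

# hypothetical haul table
toy_all <- data.table(
  survey_unit = c("AI", "AI", "AI", "EBS"),
  month       = c(5, 6, 7, 6),
  year        = c(1991, 1991, 1994, 1983),
  haul_id     = c("a1", "a2", "a3", "e1")
)

select_hauls(toy_all, "AI", 6:8)
select_hauls(toy_all, "EBS", 6:8, min_year = 1984)
```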


####BITS
(We have two surveys for BITS, quarter 1 and quarter 4)
BITS-1

From Fredston et al. 2023, every year after 2000 has >400 hauls and most of the earlier years are <50 

```{r  BITS1 visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Keep both months (2,3)
- Seemingly consistent spatial distribution through time
- Consistent # of species and # hauls after 2000
```{r BITS1 processing}
bits1_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-1" & month %in% c(2,3) & year > 2000,haul_id])
```

BITS-4
```{r  BITS4 visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "BITS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "BITS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "BITS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "BITS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```

- Keep (10,11,12)
- Keep years after 2000 (survey starts in 1996, but there is a gap in 1997 and 1998, and 1996 was sampled entirely in December; spp richness in the first survey is also very low; # of hauls is consistent after 2000)
- Seemingly consistent spatial distribution through time

```{r BITS4 processing}
bits4_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "BITS-4" & month %in% c(10:12) & year > 2000,haul_id])
```
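
Gaps such as the missing 1997 and 1998 surveys can also be listed directly instead of being read off the plots. The sketch below uses hypothetical sampled years (`toy_years`) that mimic an early gap, and returns every year between the first and last survey year with no hauls.

```{r}
library(data.table)

# hypothetical sampled years for one survey unit, mimicking an early gap
toy_years <- data.table(
  survey_unit = "A",
  year        = c(1996, 1999, 2000, 2001, 2002, 2003)
)

# years between the first and last survey year that have no hauls
toy_gaps <- toy_years[, .(missing_year = setdiff(min(year):max(year), year)),
                      by = survey_unit]
toy_gaps
```

A survey unit with no gaps contributes no rows to the result.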


####CHL (Chile)

```{r  CHL visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "CHL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "CHL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "CHL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "CHL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "CHL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "CHL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "CHL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Keep (7,8,9)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

```{r CHL processing}
chl_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "CHL" & month %in% c(7:9),haul_id])
```



####DFO-NF


```{r  DFO-NF visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-NF",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-NF",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-NF",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
- Keep (10,11,12)
- Seemingly consistent spatial distribution through time
- No major changes in spp richness through time
- No major changes in # hauls through time

```{r DFO-NF processing}
dfo_nf_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-NF" & month %in% c(10:12),haul_id])
```


####DFO-QCS

```{r  DFO-QCS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "DFO-QCS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "DFO-QCS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```
- Keep (7,8)
- Seemingly consistent spatial distribution through time
- No major changes in richness over time
- No major changes in # hauls

```{r DFO-QCS processing}
dfo_qcs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "DFO-QCS" & month %in% c(7,8),haul_id])
```



####EBS

- Sampling years prior to 1984 (data begin in 1982) were excluded from analysis due to large apparent increases in the number of species recorded in the first two years. (Batt et al. 2017)

```{r  EBS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EBS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EBS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EBS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EBS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EBS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EBS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EBS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()
```

- Keep (6,7,8)
- Seemingly consistent spatial distribution through time
- Per Batt et al. 2017, limit to >= 1984
- No clear changes in richness through time
- No clear changes in # hauls through time

```{r EBS processing}
ebs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EBS" & month %in% c(6,7,8) & year >= 1984,haul_id])
```


####EVHOE

```{r  EVHOE visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "EVHOE",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "EVHOE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "EVHOE",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "EVHOE",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 10,11,12
-Seemingly consistent spatial distribution through time
-Very low sampling in 2017 (and also low richness), exclude this year

```{r EVHOE processing}
evhoe_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "EVHOE" & month %in% c(10,11,12) & year != 2017 ,haul_id])
```
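
The decision to drop 2017 was made by eye from the hauls-per-year plot. A rough numerical cross-check, offered only as a sketch, is to flag years whose haul count falls below some fraction of the survey's median. The 50% threshold, the function name `flag_low_years()`, and the toy counts are assumptions for illustration, not values from the manuscript.

```r
library(data.table)

# Hypothetical helper: flag years with fewer hauls than `frac` x the median year
flag_low_years <- function(hauls, frac = 0.5) {
  per_year <- hauls[, .(n_hauls = uniqueN(haul_id)), by = year]
  per_year[, low_sampling := n_hauls < frac * median(n_hauls)]
  per_year[order(year)]
}

# Toy data: 10 hauls in each of 2015 and 2016, only 2 hauls in 2017
toy_hauls <- data.table(
  year    = c(rep(2015, 10), rep(2016, 10), rep(2017, 2)),
  haul_id = paste0("h", 1:22)
)

low_years <- flag_low_years(toy_hauls)
print(low_years)
```

A flag like this only prompts a closer look at the plots; it does not replace the visual judgment used in this script.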


####FALK (excluded from final dataset)
```{r FALK visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FALK",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FALK",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FALK",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FALK",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FALK",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FALK",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FALK",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
-Keep February (2) only from 2004 onward (most consistent sampling)
-Inconsistent spatial distribution through time, but this will be fixed in next step with spatial standardization


```{r FALK processing}
falk_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FALK" & month %in% c(2) & year >= 2004, haul_id])
```
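
The diagnostic plots in this section repeat the same template for every survey, changing only the `survey_unit` value. As a sketch of how that repetition could be reduced, the month-by-quarter (or month-by-season) tile plot can be written as a function. The name `plot_haul_tiles()` and the toy table are hypothetical, and `scale_fill_viridis_c()` from ggplot2 is swapped in for `viridis::scale_fill_viridis()` so the example needs fewer packages.

```r
library(data.table)
library(ggplot2)

# Hypothetical helper: tile plot of haul proportions for a single survey.
# y_col and fill_col are column names passed as strings.
plot_haul_tiles <- function(dt, survey_name, y_col, fill_col, y_lab, fill_lab) {
  ggplot(dt[dt[["survey_unit"]] == survey_name, ]) +
    geom_tile(aes(x = factor(month), y = factor(.data[[y_col]]),
                  fill = .data[[fill_col]]), color = "white") +
    scale_fill_viridis_c() +
    labs(x = "Month", y = y_lab, fill = fill_lab) +
    theme_classic()
}

# Toy data standing in for FishGlob.10year.uniquehauls.quarter (values are made up)
toy_quarter <- data.table(
  survey_unit = c("FALK", "FALK", "GIN"),
  month       = c(2, 11, 5),
  quarter     = c(1, 4, 2),
  haul_proportion_survey_quarter = c(0.8, 0.2, 1)
)

p <- plot_haul_tiles(toy_quarter, "FALK",
                     y_col    = "quarter",
                     fill_col = "haul_proportion_survey_quarter",
                     y_lab    = "Quarter",
                     fill_lab = "Proportion of All Survey Hauls in FishGlob")
```

Printing `p` draws the plot; the same function could be called in a loop over survey names.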


####FR-CGFS

```{r  FR-CGFS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "FR-CGFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "FR-CGFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
-Keep months 9,10,11
-Consistent spatial distribution through time
-Seemingly consistent richness through time
-Seemingly consistent # hauls through time


```{r FR-CGFS processing}
fr_cgfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "FR-CGFS" & month %in% c(9,10,11), haul_id])
```

####GIN (excluded from final dataset)

```{r  GIN visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GIN",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GIN",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GIN",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GIN",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GIN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GIN",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GIN",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Exclude this region, no consistent sampling through time

```{r GIN processing}
gin_hauls_keep <- NULL
```
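
Setting an excluded survey's vector to `NULL` is convenient if the per-survey vectors are later concatenated, because `c()` drops `NULL` arguments. How the vectors are combined is not shown in this section, so the following is only a sketch, with made-up haul IDs and a hypothetical `all_hauls_keep` object.

```r
library(data.table)

# Made-up keep vectors; the excluded survey contributes nothing
toy_a_hauls_keep   <- c("a1", "a2")
toy_gin_hauls_keep <- NULL
toy_b_hauls_keep   <- c("b1")

# c() ignores NULL, so the result holds only the retained haul IDs
all_hauls_keep <- c(toy_a_hauls_keep, toy_gin_hauls_keep, toy_b_hauls_keep)

toy_hauls <- data.table(
  haul_id     = c("a1", "a2", "a3", "g1", "b1"),
  survey_unit = c("A", "A", "A", "GIN", "B")
)

# Subset the haul table to the retained IDs
toy_reduced <- toy_hauls[haul_id %in% all_hauls_keep]
print(toy_reduced)
```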

####GMEX
-In the Gulf of Mexico, we restricted our analysis to data from 1984-2000 (full range 1982-2014); if all years had been used, the number of sites sampled in at least 85% of years would drop from 39 to 13 (Batt et al. 2017).
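
The site-retention criterion quoted from Batt et al. (2017) can be illustrated with a small sketch: for each site, compute the share of survey years in which it was sampled, then count the sites at or above 85%. The `site_id` column, the function name `count_consistent_sites()`, and the toy data are hypothetical; Batt et al. define sites in their own way, which is not reproduced here.

```r
library(data.table)

# Hypothetical helper: number of sites sampled in at least `min_prop` of the
# years that have any sampling within `year_range`
count_consistent_sites <- function(hauls, year_range, min_prop = 0.85) {
  sub      <- hauls[year %in% year_range]
  n_years  <- uniqueN(sub$year)
  site_cov <- sub[, .(prop_years = uniqueN(year) / n_years), by = site_id]
  site_cov[prop_years >= min_prop, .N]
}

# Toy data: site "s1" is sampled every year, "s2" only from 2003 onward
toy_hauls <- rbind(
  data.table(site_id = "s1", year = 2000:2009),
  data.table(site_id = "s2", year = 2003:2009)
)

n_sites_all_years  <- count_consistent_sites(toy_hauls, 2000:2009)
n_sites_late_years <- count_consistent_sites(toy_hauls, 2003:2009)
print(c(all = n_sites_all_years, late = n_sites_late_years))
```

In the toy data, narrowing the year range raises the number of consistently sampled sites, which is the same trade-off the quoted passage describes.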

GMEX Fall 
```{r  GMEX Fall visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 9,10,11
-Inconsistent spatial distribution through time, will restrict to longitudes west of -87.5 (longitude_adj < -87.5)
-Seemingly consistent richness through time
-Seemingly consistent # hauls through time


```{r GMEX-Fall processing}
gmex_fall_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Fall" & month %in% c(9,10,11) & longitude_adj < -87.5, haul_id])
```

GMEX Summer
```{r  GMEX Summer visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GMEX-Summer",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GMEX-Summer",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 5,6,7
-Inconsistent spatial distribution through time, but this will be fixed in spatial standardization step
-Seemingly consistent richness within each period (before 2008, and from 2008 onward)
-Seemingly consistent # hauls through time
-Jump from 2007 to 2008, when spatial footprint increases, so I will only use data from before 2008

```{r GMEX-Summer processing}
gmex_summer_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GMEX-Summer" & month %in% c(5,6,7) & year < 2008, haul_id])
```
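
The 2007 to 2008 jump in spatial footprint was identified from the maps above. One rough way to quantify such a change, sketched here under simplifying assumptions, is to track the longitude and latitude span of hauls in each year. A bounding box is only a crude proxy for footprint, and the toy coordinates below are made up; the column names follow the code above.

```r
library(data.table)

# Toy hauls: the sampled area widens eastward in 2008
toy_hauls <- data.table(
  year          = c(2006, 2006, 2007, 2007, 2008, 2008),
  longitude_adj = c(-97, -90, -96.5, -90.5, -97, -83),
  latitude      = c(27, 29, 27.5, 29, 26, 30)
)

# Bounding-box extent per year, in degrees
footprint <- toy_hauls[, .(
  lon_span = max(longitude_adj) - min(longitude_adj),
  lat_span = max(latitude) - min(latitude)
), by = year][order(year)]

# Year-to-year change in longitudinal span (NA for the first year)
footprint[, lon_span_change := lon_span - shift(lon_span)]
print(footprint)
```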

####GOA
```{r GOA visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GOA",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GOA",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GOA",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GOA",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GOA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GOA",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GOA",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep months 6,7,8
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent #hauls through time

```{r GOA processing}
goa_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GOA" & month %in% c(6,7,8), haul_id])
```
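
The richness and haul-count plots rely on precomputed columns (`spp_count_survey_year`, `haulid_count_survey_year`) that are built earlier in the script. As a hedged illustration of what such a summary might look like, unique taxa and unique hauls can be counted per survey and year. The `accepted_name` column, the output column names, and the toy records are assumptions, not the script's actual definitions.

```r
library(data.table)

# Toy catch records: one row per taxon per haul (values are made up)
toy_catch <- data.table(
  survey_unit   = "GOA",
  year          = c(2001, 2001, 2001, 2003, 2003),
  haul_id       = c("h1", "h1", "h2", "h3", "h3"),
  accepted_name = c("Gadus chalcogrammus", "Hippoglossus stenolepis",
                    "Gadus chalcogrammus", "Gadus chalcogrammus",
                    "Sebastes alutus")
)

# Unique taxa and unique hauls per survey and year
toy_summary <- toy_catch[, .(
  spp_count  = uniqueN(accepted_name),
  haul_count = uniqueN(haul_id)
), by = .(survey_unit, year)][order(year)]
print(toy_summary)
```

Viewing the two counts side by side is useful because observed richness generally rises with the number of hauls, so a change in effort can look like a change in richness.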

####GRL-DE
-From Beukhof et al. 2019, all surveys in October and November
```{r GRL-DE visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GRL-DE",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GRL-DE",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GRL-DE",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-No months in data set, but according to Beukhof et al. 2019, all sampling occurred in October and November, so keep all hauls
-Consistent spatial distribution through time
-Seemingly consistent richness
-# of hauls drops between 1991 and 1992, and is low in both 1992 and 2017, so limit to the years in between (1993-2016)

```{r GRL-DE processing}
grl_de_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GRL-DE" & year %in% c(1993:2016), haul_id])
```
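
GRL-DE is filtered on year alone because it has no month values. Assuming missing months are stored as `NA`, a filter of the form `month %in% c(...)` would drop every haul for this survey, since `NA %in% x` is `FALSE`. The sketch below, with made-up data, shows a quick check for missing months and the effect of such a filter.

```r
library(data.table)

# Toy hauls: GRL-DE has no month recorded, GOA does
toy_hauls <- data.table(
  survey_unit = c("GRL-DE", "GRL-DE", "GOA", "GOA"),
  haul_id     = c("g1", "g2", "a1", "a2"),
  month       = c(NA, NA, 6, 7)
)

# Share of hauls with no recorded month, by survey
month_missing <- toy_hauls[, .(prop_month_missing = mean(is.na(month))),
                           by = survey_unit]
print(month_missing)

# A month filter keeps nothing for a survey whose months are all NA
n_kept <- toy_hauls[survey_unit == "GRL-DE" & month %in% c(10, 11), .N]
print(n_kept)
```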

####GSL

GSL-N
```{r GSL-N visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-N",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-N",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-N",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-N",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 6,7,8
-Consistent spatial distribution through time
-Seemingly consistent richness
-# of hauls in 2005 is higher, so start in 2006

```{r GSL-N processing}
gsl_n_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-N" & year > 2005, haul_id])
```

GSL-S
```{r GSL-S visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "GSL-S",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "GSL-S",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "GSL-S",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "GSL-S",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 8,9,10
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r GSL-S processing}
gsl_s_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "GSL-S" & month %in% c(8:10), haul_id])
```

####ICE-GFS

```{r ICE-GFS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ICE-GFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ICE-GFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 2,3,4
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r ICE-GFS processing}
ice_gfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ICE-GFS" & month %in% c(2:4), haul_id])
```

####IE-IGFS

```{r IE-IGFS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IE-IGFS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IE-IGFS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 10,11,12
-Consistent spatial distribution through time after 2004 (sampled far east in 2003 and 2004)
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r IE-IGFS processing}
ie_igfs_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "IE-IGFS" & month %in% c(10:12) & year > 2004, haul_id])
```

####IS-MOAG (excluded from final dataset)
```{r IS-MOAG visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "IS-MOAG",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "IS-MOAG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "IS-MOAG",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Sampling too scattered over time, excluding

```{r IS-MOAG processing}
is_moag_hauls_keep <- NULL
```

####MEDITS
```{r MEDITS visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MEDITS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MEDITS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MEDITS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MEDITS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep all surveys in quarter 2
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r MEDITS processing}
medits_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "MEDITS", haul_id])
```
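
The same set of diagnostic plots is repeated for every survey unit with only the `survey_unit` value changing. A minimal sketch of how one of them could be wrapped in a function, assuming the input table has the columns used above (`plot_quarter_tiles` and the toy values are hypothetical, not part of the original script):

```r
library(data.table)
library(ggplot2)
library(viridis)

# Hypothetical helper (not part of the original script): month-by-quarter
# tile plot of haul proportions for a single survey unit
plot_quarter_tiles <- function(hauls_quarter, unit) {
  ggplot(hauls_quarter[survey_unit == unit, ]) +
    geom_tile(aes(x = factor(month), y = factor(quarter),
                  fill = haul_proportion_survey_quarter), color = "white") +
    scale_fill_viridis() +
    labs(x = "Month", y = "Quarter", fill = "Proportion of All Survey Hauls in FishGlob") +
    theme_classic()
}

# Toy input with made-up proportions
toy_quarter <- data.table(
  survey_unit = c("MEDITS", "MEDITS", "MRT"),
  month = c(5, 6, 11),
  quarter = c(2, 2, 4),
  haul_proportion_survey_quarter = c(0.4, 0.6, 1)
)
p_medits <- plot_quarter_tiles(toy_quarter, "MEDITS")
```

Calling the helper once per survey unit would give the same quarter plot as the copied blocks; the other diagnostic plots could be wrapped the same way.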


####MRT (excluded from final dataset)
```{r MRT visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "MRT",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "MRT",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "MRT",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "MRT",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "MRT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "MRT",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "MRT",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Sampling inconsistent, exclude completely

```{r MRT processing}
mrt_hauls_keep <- NULL
```
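
Setting `mrt_hauls_keep` to `NULL` means MRT contributes no haul IDs if the per-survey keep vectors are later concatenated, because `c()` drops `NULL` elements. A minimal sketch of that behaviour, assuming the vectors are eventually combined with `c()` (the haul IDs and the name `all_hauls_keep` are hypothetical):

```r
# Hypothetical haul IDs, for illustration only
medits_example <- c("MEDITS_h1", "MEDITS_h2")
mrt_example    <- NULL            # excluded survey: nothing to keep
nam_example    <- c("NAM_h1")

# c() silently drops NULL, so the excluded survey adds no entries
all_hauls_keep <- c(medits_example, mrt_example, nam_example)
length(all_hauls_keep)
```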

####NAM

```{r NAM visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NAM",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NAM",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NAM",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NAM",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NAM",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NAM",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NAM",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep surveys in months 1 and 2 (most consistently sampled)
-Consistent spatial distribution through time
-Seemingly consistent richness except for 1998 (exclude)
-Seemingly consistent number of hauls except for 1998 (exclude)

```{r NAM processing}
nam_hauls_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NAM" & month %in% c(1,2) & year != 1998, haul_id])
```
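
The NAM filter combines a month window with a single excluded year. A small sketch on a toy table shows what it keeps; the rows are made up, and only the column names follow `FishGlob.10year.uniquehauls`:

```r
library(data.table)

# Toy haul table (hypothetical values); haul "d" appears twice on purpose
toy_nam_hauls <- data.table(
  survey_unit = "NAM",
  haul_id = c("a", "b", "c", "d", "d"),
  month   = c(1, 2, 5, 1, 1),
  year    = c(1997, 1999, 1999, 1998, 1998)
)

# Keep months 1-2, drop 1998, and de-duplicate haul IDs
toy_nam_keep <- unique(
  toy_nam_hauls[survey_unit == "NAM" & month %in% c(1, 2) & year != 1998, haul_id]
)
toy_nam_keep
```

Haul "c" is dropped by the month filter and haul "d" by the year filter.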


####NEUS


NEUS Spring
```{r NEUS-Spring visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Spring",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Spring",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 3,4,5 months
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness (especially after 87, should be fixed with standardization step)
-Seemingly consistent number of hauls (especially after 81, should be fixed with standardization step)

```{r NEUS-Spring processing}
#calculate wgt_cpue (per km^2; km^2 average from Sean Lucey) and wgt_h (all biomass values calibrated to the standard pre-2009 30-minute tow)
FishGlob.10year.spp[survey == "NEUS", wgt_h := wgt/0.5][survey == "NEUS", wgt_cpue := wgt/0.0384][survey == "NEUS", num_h := num/0.5][survey == "NEUS", num_cpue := num/0.0384]


#also, for the Northeast US, drop hauls before 2009 whose duration is not within +/- 5 minutes of 30 minutes, and hauls from 2009 onward not within +/- 5 minutes of 20 minutes (haul_dur is in hours)
neus_spring_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Spring" & month %in% c(3:5) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
                                                        (survey_unit == "NEUS-Spring" & month %in% c(3:5) & year >= 2009 & (haul_dur > 0.25  & haul_dur < 0.42))), haul_id])


```
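
The NEUS conversion divides catch by 0.5 (hours in a standard 30-minute tow) and by 0.0384 (read here as the average area per tow in km^2, based on the comment above). A minimal sketch of the same arithmetic with a single `:=` call, assuming a toy table with made-up values (`toy_catch`, `tow_hours`, and `tow_km2` are hypothetical names):

```r
library(data.table)

# Hypothetical catch records; only the NEUS rows should be converted
toy_catch <- data.table(
  survey = c("NEUS", "NEUS", "OTHER"),
  wgt    = c(1.92, 0.96, 5),
  num    = c(10, 4, 7)
)

tow_hours <- 0.5     # 30-minute standard tow, in hours
tow_km2   <- 0.0384  # assumed average area per tow, in km^2

# One `:=` call adds all four columns; non-NEUS rows are left as NA
toy_catch[survey == "NEUS", `:=`(
  wgt_h    = wgt / tow_hours,
  wgt_cpue = wgt / tow_km2,
  num_h    = num / tow_hours,
  num_cpue = num / tow_km2
)]
toy_catch
```

This is equivalent to the four chained assignments, just with the subset evaluated once.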

NEUS Fall

```{r NEUS-Fall visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NEUS-Fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NEUS-Fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NEUS-Fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 9,10,11 months
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness (especially after 84, should be fixed with standardization step)
-Seemingly consistent number of hauls (especially after 85, should be fixed with standardization step)

```{r NEUS-Fall processing}

#also, for the Northeast US, drop hauls before 2009 whose duration is not within +/- 5 minutes of 30 minutes, and hauls from 2009 onward not within +/- 5 minutes of 20 minutes (haul_dur is in hours)
neus_fall_keep <- unique(FishGlob.10year.uniquehauls[((survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year < 2009 & (haul_dur > 0.42 & haul_dur < 0.58)) |
                                                        (survey_unit == "NEUS-Fall" & month %in% c(9,10,11) & year >= 2009 & (haul_dur > 0.25  & haul_dur < 0.42))), haul_id])
```
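
The duration thresholds in the NEUS filters are the target tow time plus or minus 5 minutes, converted to hours and rounded to two decimals. A short sketch of that conversion, assuming `haul_dur` is in hours (`tow_window` and `in_tow_window` are hypothetical helpers, not part of the original script):

```r
# Hypothetical helper: bounds in hours for a target tow duration +/- a
# tolerance (both in minutes), rounded to two decimals
tow_window <- function(target_min, tol_min = 5) {
  round(c(lower = target_min - tol_min, upper = target_min + tol_min) / 60, 2)
}

# TRUE when a haul duration (hours) falls strictly inside the window
in_tow_window <- function(haul_dur, target_min, tol_min = 5) {
  w <- tow_window(target_min, tol_min)
  haul_dur > w[["lower"]] & haul_dur < w[["upper"]]
}

tow_window(30)  # pre-2009 standard tow
tow_window(20)  # 2009 onward
in_tow_window(c(0.5, 0.33), target_min = 30)
```

The two windows share the rounded boundary at 0.42 hours, and both comparisons are strict, so a haul of exactly 0.42 hours is excluded from either period.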

####NIGFS
Northern Ireland

Spring Northern Ireland (quarter 1)

```{r NIGFS spring visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 2,3,4 months
-Inconsistent spatial distribution through time, but should be caught in standardization step
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r NIGFS 1 processing}
nigfs_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-1" & month %in% c(2,3,4), haul_id])
```


Fall Northern Ireland (quarter 4)

```{r NIGFS fall visual}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NIGFS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NIGFS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Keep 10,11 months
-Consistent spatial distribution through time (any remaining inconsistency should be caught in standardization step)
-Seemingly consistent richness
-Seemingly consistent number of hauls

```{r NIGFS 4 processing}
nigfs_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NIGFS-4" & month %in% c(10,11), haul_id])
```

####Nor-BTS

The original FishGlob includes Nor-BTS-1 as well, but it was not shared by L. Pecuchet and is therefore ignored here

Nor-BTS-3
```{r Nor-BTS-3}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "Nor-BTS-3",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "Nor-BTS-3",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 8,9,10
-Somewhat consistent spatial distribution through time
-Number of hauls is variable, but no clear years to exclude
-Laurene Pecuchet (U Tromso) told us that only surveys from 2004 onward are suitable for biodiversity analyses


```{r Nor-BTS-3 processing}
nor_bts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "Nor-BTS-3" & month %in% c(8:10) & year >= 2004, haul_id])
```

####NS-IBTS

NS-IBTS-1
```{r NS-IBTS-1}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1,2,3
-Consistent spatial distribution through time
-Linear increase in richness; the cutoff is clearer in # of hauls
-Linear increase in # of hauls, but somewhat clear break between late 70s and mid-80s; only keep hauls from 1984 onward


```{r NS-IBTS-1 processing}
ns_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-1" & month %in% c(1:3) & year >= 1984, haul_id])
```
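
The cutoff year is chosen from the hauls-per-year bar plot. The same counts can be tabulated directly; a minimal sketch on made-up data, assuming one row per haul record with `haul_id` and `year` columns (`toy_ibts_hauls` and `hauls_per_year` are hypothetical names):

```r
library(data.table)

# Toy records (hypothetical); haul "h2" is duplicated on purpose
toy_ibts_hauls <- data.table(
  haul_id = c("h1", "h2", "h2", "h3", "h4", "h5"),
  year    = c(1980, 1980, 1980, 1984, 1984, 1984)
)

# Count distinct hauls per year, ordered by year, to look for breaks in effort
hauls_per_year <- toy_ibts_hauls[, .(n_hauls = uniqueN(haul_id)), by = year][order(year)]
hauls_per_year
```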

NS-IBTS-3
```{r NS-IBTS-3}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NS-IBTS-3",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NS-IBTS-3",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 7,8,9
-Consistent spatial distribution through time
-Consistent richness through time
-Early years lower # hauls, will start at 1998


```{r NS-IBTS-3 processing}
ns_ibts_3_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NS-IBTS-3" & month %in% c(7:9) & year >= 1998, haul_id])
```


####NZ

NZ-CHAT

```{r NZ-CHAT}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-CHAT",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-CHAT",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 12,1,2 (NOTE THAT THIS NZ-CHAT SURVEY CROSSES YEAR, SO WE ALREADY LUMPED 12 with NEXT year)
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls from 1995 onward


```{r NZ-CHAT processing}


nz_chat_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-CHAT" & month %in% c(12,1,2) & year >= 1995, haul_id])

```
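
Because NZ-CHAT runs from December into the following year, December hauls are counted with the next year's survey. That lumping happens earlier in the script and is not shown here; the sketch below is only an assumed illustration of the logic, with made-up dates and a hypothetical `survey_year` variable:

```r
# Hypothetical calendar dates of hauls
calendar_year <- c(1994, 1995, 1995, 1995)
haul_month    <- c(12, 1, 2, 12)

# December hauls are assigned to the following survey year
survey_year <- ifelse(haul_month == 12, calendar_year + 1, calendar_year)
survey_year
```

With this convention, a `year >= 1995` filter also retains hauls taken in December 1994.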

NZ-ECSI

```{r NZ-ECSI}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-ECSI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-ECSI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 4,5,6
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls
-Gap between 1995 and 2005, but we have 10 total years so we'll keep for now


```{r NZ-ECSI processing}
nz_ecsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-ECSI" & month %in% c(4,5,6), haul_id])
```
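
The decision to keep NZ-ECSI rests on the number of sampled years despite the gap. A quick sketch of how both quantities could be checked, using made-up years rather than the real survey record (`sampled_years` is hypothetical):

```r
# Hypothetical sampled years with a long gap in the middle
sampled_years <- c(1991, 1992, 1993, 1994, 1995, 2005, 2006, 2007, 2008, 2009)

n_years     <- length(unique(sampled_years))           # distinct years sampled
largest_gap <- max(diff(sort(unique(sampled_years))))  # longest gap between sampled years
c(n_years = n_years, largest_gap = largest_gap)
```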

NZ-SUBA

```{r NZ-SUBA}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-SUBA",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-SUBA",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 11 and 12
-Consistent spatial distribution through time
-Seemingly consistent richness
-Far more hauls in 1990s, these early sampling years will be excluded (start in 2000)


```{r NZ-SUBA processing}
nz_suba_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-SUBA" & month %in% c(11,12) & year >= 2000, haul_id])
```

NZ-WCSI

```{r NZ-WCSI}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "NZ-WCSI",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "NZ-WCSI",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```
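
The same set of diagnostic plots is repeated for every survey unit. As a minimal sketch only (not part of the original workflow), the tile plots could be wrapped in a helper. `plot_month_tile()` and the toy table `toy_season` are hypothetical names, the values are made up, and `scale_fill_viridis_c()` is ggplot2's built-in continuous viridis scale, used here in place of `scale_fill_viridis()` so the sketch needs only ggplot2 and data.table.

```r
library(data.table)
library(ggplot2)

# Hypothetical helper: tile plot of haul proportions by month for one survey unit.
# `dt` must contain survey_unit, month, and the columns named in y_col and fill_col.
plot_month_tile <- function(dt, unit_name, y_col, fill_col, y_lab, fill_lab) {
  ggplot(dt[survey_unit == unit_name]) +
    geom_tile(aes(x = factor(month), y = factor(.data[[y_col]]), fill = .data[[fill_col]]),
              color = "white") +
    scale_fill_viridis_c() +
    labs(x = "Month", y = y_lab, fill = fill_lab) +
    theme_classic()
}

# Toy stand-in for FishGlob.10year.uniquehauls.season (invented values)
toy_season <- data.table(
  survey_unit = "NZ-WCSI",
  month = c(3, 4),
  season = c("Fall", "Fall"),
  haul_proportion_survey_season = c(0.6, 0.4)
)

p <- plot_month_tile(toy_season, "NZ-WCSI",
                     y_col = "season",
                     fill_col = "haul_proportion_survey_season",
                     y_lab = "Season",
                     fill_lab = "Proportion of All Survey Hauls in FishGlob")
```

Passing the column names as strings keeps one helper usable for both the season and quarter tables.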

-Use months 3,4
-Consistent spatial distribution through time
-Seemingly consistent richness
-Linear decrease in # of hauls through time, leave out first two years with highest # hauls (>= 1995)


```{r NZ-WCSI processing}
nz_wcsi_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "NZ-WCSI" & month %in% c(3,4) & year >= 1995, haul_id])
```

####PT-IBTS
```{r PT-IBTS}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "PT-IBTS",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "PT-IBTS",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 9,10,11
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls


```{r PT-IBTS processing}
pt_ibts_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "PT-IBTS" & month %in% c(9,10,11), haul_id])
```

####ROCKALL

```{r ROCKALL}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ROCKALL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ROCKALL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ROCKALL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 8,9
-Consistent spatial distribution through time
-Seemingly consistent richness
-Seemingly consistent number of hauls


```{r ROCKALL processing}
rockall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ROCKALL" & month %in% c(8,9), haul_id])
```

####S-GEORG

```{r S-GEORG}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "S-GEORG",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "S-GEORG",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "S-GEORG",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1 and 2
-Consistent spatial distribution through time
-Seemingly consistent richness except for 2003, will be excluded
-Seemingly consistent number of hauls, except for 2012, will be excluded


```{r SGeorge processing}
s_georg_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "S-GEORG" & month %in% c(1,2) & !(year %in% c(2003,2012)), haul_id])
```
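
The excluded years above were picked by eye from the bar plots. As a hedged sketch of how unusual years could be flagged numerically, the block below marks years whose haul count is more than `k` median absolute deviations from the median. The counts are invented, the cutoff `k = 3` is an assumption, and this is not the procedure used for the manuscript; only the column name `haulid_count_survey_year` follows the document.

```r
library(data.table)

# Invented annual haul counts for a single survey unit
toy_hauls_per_year <- data.table(
  year = 2008:2014,
  haulid_count_survey_year = c(60, 62, 58, 61, 20, 59, 63)
)

k <- 3  # assumed cutoff, in units of the (scaled) median absolute deviation

# Flag years whose count is far from the median count
toy_hauls_per_year[, flagged := abs(haulid_count_survey_year - median(haulid_count_survey_year)) >
                     k * mad(haulid_count_survey_year)]

flagged_years <- toy_hauls_per_year[flagged == TRUE, year]
```

A flag like this is only a prompt to look at the plots again; whether a year is dropped remains a judgment call.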

####SCS

Spring
```{r SCS-SPRING}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SPRING",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SPRING",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 2,3,4
-Inconsistent spatial distribution through time (northern latitudes only sampled in early years), only include longitudes < -62 and latitudes < 45.5
-Seemingly consistent richness
-Number of hauls is variable, exclude super low and high numbers (1985,1994,2015,2019)


```{r scs_spring processing}
scs_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SPRING" & month %in% c(2,3,4) & !(year %in% c(1985,1994,2015,2019)) & longitude_adj < -62 & latitude < 45.5, haul_id])
```
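
As a quick check on a filter like the one above, the sketch below applies the same conditions to a toy table and confirms that no excluded year or out-of-bounds position survives. The table `toy_hauls` and all of its values are invented for illustration; only the column names follow the document.

```r
library(data.table)

# Invented hauls: some fail the month, year, or spatial conditions
toy_hauls <- data.table(
  haul_id       = paste0("h", 1:6),
  survey_unit   = "SCS-SPRING",
  year          = c(1985, 1990, 1990, 1994, 2000, 2000),
  month         = c(3, 2, 5, 3, 4, 3),
  longitude_adj = c(-63, -64, -63, -65, -63, -60),
  latitude      = c(44, 43, 44, 44, 45, 46)
)

excluded_years <- c(1985, 1994, 2015, 2019)

# Same conditions as the processing chunk
toy_kept <- toy_hauls[survey_unit == "SCS-SPRING" &
                        month %in% c(2, 3, 4) &
                        !(year %in% excluded_years) &
                        longitude_adj < -62 & latitude < 45.5]

kept_ids <- unique(toy_kept$haul_id)

# Sanity checks on what remains
stopifnot(!any(toy_kept$year %in% excluded_years))
stopifnot(all(toy_kept$longitude_adj < -62), all(toy_kept$latitude < 45.5))
```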

Summer
```{r SCS-SUMMER}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SCS-SUMMER",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SCS-SUMMER",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 6,7,8
-Consistent spatial distribution through time
-Richness increases linearly, not a clear break point, using breakpoint from # of hauls, but will exclude 2010 which has a very high richness
-# Hauls increases linearly from ~120 in 1970 to ~220 in 2020, not a clear breakpoint, but will go with 1986 because there is a jump between 85 and 86
-Gear change in 1983 (Ellingsen et al. 2015)


```{r scs_summer processing}
scs_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SCS-SUMMER" & month %in% c(6,7,8) & year >= 1986 & year != 2010, haul_id])
```
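
The 1986 start year above was chosen from the plot. One way to make such a jump visible, sketched here on invented counts, is to difference the annual haul counts and look for the largest year-over-year increase. The column name `haulid_count_survey_year` follows the document; the table, the values, and the new column `change_from_prev` are hypothetical.

```r
library(data.table)

# Invented annual haul counts with a step between two consecutive years
toy_counts <- data.table(
  year = 1982:1989,
  haulid_count_survey_year = c(120, 122, 125, 126, 150, 152, 151, 155)
)

setorder(toy_counts, year)

# Year-over-year change; the first year has no predecessor, so it is NA
toy_counts[, change_from_prev := haulid_count_survey_year - shift(haulid_count_survey_year)]

# which.max() skips the NA in the first row
largest_jump_year <- toy_counts[which.max(change_from_prev), year]
```

With a gradual linear increase the differences are all similar, which is another way of seeing that there is no clear breakpoint.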


####SEUS


Spring

```{r SEUS-spring}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-spring",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-spring",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 4,5,6
-Consistent spatial distribution through time
-Consistent richness through time
-# Hauls low in 1989 and 2018, will exclude

```{r seus_spring processing}
seus_spring_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-spring" & month %in% c(4,5,6) & year != 1989 & year != 2018, haul_id])
```


Summer

```{r SEUS-summer}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-summer",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-summer",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 7,8
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls low in first year, otherwise okay, just exclude first year (1989)

```{r seus_summer processing}
seus_summer_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-summer" & month %in% c(7,8) & year != 1989, haul_id])
```


Fall

```{r SEUS-fall}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SEUS-fall",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SEUS-fall",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 9,10,11
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls low in first year, otherwise okay, just exclude first year (1989)


```{r seus_fall processing}
seus_fall_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SEUS-fall" & month %in% c(9,10,11) & year != 1989, haul_id])
```


####SWC-IBTS

Scotland Shelf Sea

SWC-IBTS 1

```{r SWC-IBTS-1}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-1",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-1",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 1,2,3
-Somewhat inconsistent spatial distribution through time, but this should be addressed in spatial standardization procedure 
-Richness consistent through time
-# Hauls consistent except low in 1995, just exclude 1995



```{r swc-ibts-1 processing}
swc_ibts_1_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-1" & month %in% c(1,2,3) & year != 1995, haul_id])
```

SWC-IBTS 4

```{r SWC-IBTS-4}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "SWC-IBTS-4",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "SWC-IBTS-4",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Use months 10,11,12
-Somewhat inconsistent spatial distribution through time (southern latitudes only sampled in early years), but this should be addressed in spatial standardization procedure 
-Richness consistent through time (especially after mid 90s)
-# Hauls consistent except low before 1995 and low in 2013, exclude these


```{r swc-ibts-4 processing}
# Keep 1995 onward and drop 2013 (low haul counts), per the notes above
swc_ibts_4_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "SWC-IBTS-4" & month %in% c(10,11,12) & year >= 1995 & year != 2013, haul_id])
```

####WCANN


```{r WCANN}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "WCANN",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "WCANN",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "WCANN",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "WCANN",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "WCANN",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "WCANN",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "WCANN",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-One exception here: use four months (6,7,8,9) because all four are sampled consistently, with lower-latitude areas consistently sampled later in the summer
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls consistent through time


```{r wcann processing}
wcann_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "WCANN" & month %in% c(6:9), haul_id])
```

####WCTRI
-Exclude because only 10 years and overlaps somewhat with WCANN

```{r wctri processing}
wctri_keep <- NULL
```


####ZAF

ATL
```{r ZAF ATL}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-ATL",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-ATL",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Include 1,2,3
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls consistent through time after 1991


```{r zaf atl processing}
zaf_atl_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-ATL" & month %in% c(1:3) & year >= 1991, haul_id])
```


IND
```{r ZAF IND}
ggplot(FishGlob.10year.uniquehauls.season[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = factor(month), y = factor(season), fill = haul_proportion_survey_season),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Season",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.quarter[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = factor(month), y = factor(quarter), fill = haul_proportion_survey_quarter),color = "white") +
  scale_fill_viridis() +
  labs(x = "Month", y = "Quarter",fill = "Proportion of All Survey Hauls in FishGlob") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.month[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = year, y = factor(month), fill = haul_proportion_month_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Month",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls.annual.quarter[survey_unit == "ZAF-IND",]) +
  geom_tile(aes(x = year, y = factor(quarter), fill = haul_proportion_quarter_yearly),color = "white") +
  scale_fill_viridis() +
  labs(x = "Year", y = "Quarter",fill = "Proportion of Annual Hauls") +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = year), size = 0.3, alpha = 0.5) +
  scale_color_viridis() +
  theme_classic()

ggplot(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND",]) +
  geom_point(aes(x = longitude_adj, y = latitude, color = month), size = 0.3, alpha = 0.5) +
  scale_color_viridis(option = "plasma") +
  theme_classic()

# # of species per year??
ggplot(data = FishGlob.10year.spp_survey_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=spp_count_survey_year)) +
  geom_col() +
  theme_classic()

# # of hauls per year
ggplot(data = FishGlob.10year.spp_haulid_year.r[survey_unit == "ZAF-IND",], aes(x = year, y=haulid_count_survey_year)) +
  geom_col() +
  theme_classic()

```

-Include 4,5,6
-Consistent spatial distribution through time
-Richness consistent through time
-# Hauls consistent through 2001, and then also in 2005 and 2009-2010


```{r zaf ind processing}
zaf_ind_keep <- unique(FishGlob.10year.uniquehauls[survey_unit == "ZAF-IND" & month %in% c(4:6) & year %in% c(1985:2001,2005, 2009,2010), haul_id])
```


####Combine all lists that have _keep
```{r combine lists}
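#all objects with _keep, excluding the combined vector itself (its name also matches if this chunk is rerun)
list_obj <- setdiff(ls(pattern = "_keep"), "fishglob_haulids_to_keep")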

#combine
fishglob_haulids_to_keep <- unlist(lapply(list_obj, get)) #229894 hauls (Started with 278405)

FishGlob.10year.spp_manualclean <- FishGlob.10year.spp[haul_id %in% fishglob_haulids_to_keep,]

#Require latitude and longitude for all observations
FishGlob.10year.spp_manualclean <- FishGlob.10year.spp_manualclean[complete.cases(FishGlob.10year.spp_manualclean[,.(latitude, longitude)])] #check that this works

#another check for # years sampled
#new row for total number of years sampled
FishGlob.10year.spp_manualclean[,years_sampled := length(unique(year)),.(survey_unit)]
if (interactive()) View(unique(FishGlob.10year.spp_manualclean[,.(survey_unit, years_sampled)])) #View() is only available in interactive sessions

#save
saveRDS(FishGlob.10year.spp_manualclean, file = here::here("data","cleaned","FishGlob.10year.spp_manualclean.rds"))

```
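
Collecting the vectors with `ls(pattern = "_keep")` depends on whatever happens to be in the workspace. As a sketch of one alternative, and not how the chunk above works, the per-survey vectors could be held in a named list, which also makes it easy to count hauls per survey and to check for duplicated IDs. The survey names and haul IDs below are invented.

```r
# Hypothetical per-survey vectors of haul IDs (NULL marks an excluded survey)
kept_by_survey <- list(
  "SURVEY-A" = c("a1", "a2"),
  "SURVEY-B" = NULL,
  "SURVEY-C" = c("c1", "c2", "c3")
)

# unlist() drops NULL entries, so excluded surveys contribute nothing
all_kept <- unlist(kept_by_survey, use.names = FALSE)

# Haul IDs should not be repeated across surveys
stopifnot(anyDuplicated(all_kept) == 0)

# Number of hauls retained per survey (0 for excluded surveys)
hauls_per_survey <- lengths(kept_by_survey)
```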


####Some surveys sample through end of year, fix these
-Note that the NZ-CHAT survey crosses the calendar year, so hauls from months 1 and 2 are lumped with the previous year
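
A minimal sketch of the lumping on toy data, assuming columns named `year` and `month` as elsewhere in this script. The table `toy_chat` is invented, and the adjusted year is written to a hypothetical new column `year_adj` rather than overwriting `year`.

```r
library(data.table)

# Invented hauls from a survey season that runs from December into February
toy_chat <- data.table(
  survey_unit = "NZ-CHAT",
  haul_id = paste0("h", 1:4),
  year  = c(2005L, 2006L, 2006L, 2006L),
  month = c(12L, 1L, 2L, 12L)
)

# Assign January and February hauls to the survey season that began the previous year
toy_chat[, year_adj := ifelse(month %in% c(1, 2), year - 1L, year)]
```

Here the December 2005, January 2006, and February 2006 hauls all fall in the 2005 season, while the December 2006 haul starts the 2006 season.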
